
Performance of OpenFL vs. Flixel+OpenFL

Open giuppe opened this issue 2 years ago • 6 comments

I ran both OpenFL's BunnyMark and Flixel's FlxBunnyMark on the same machine, adding bunnies until the fps dropped under 60. The target was Windows, HXCPP, release build. For FlxBunnyMark, the options were: No Collisions/No shaders/Step: Variable/On-Screen. I also disabled angularVelocity because the bunnies in the OpenFL version are not rotating.

FlxBunnyMark: 27k bunnies

OpenFL BunnyMark: 240k bunnies

27k bunnies for Flixel+OpenFL (at 52fps) vs. 240k bunnies for OpenFL (at 54fps).
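To put the two results on the same scale, the numbers above can be converted into an average frame-time cost per bunny. The snippet below is only a back-of-the-envelope sketch: the class and function names are made up for illustration, and it assumes the whole frame time is spent on bunnies, which is a simplification.

```haxe
class PerBunnyCost {
	// Average frame time per bunny, in microseconds,
	// assuming the whole frame is spent on bunnies (a simplification).
	public static function microsPerBunny(fps:Float, bunnies:Int):Float {
		var frameMicros = 1000000.0 / fps;
		return frameMicros / bunnies;
	}

	static function main() {
		// Figures reported above: 27k bunnies at 52 fps vs. 240k bunnies at 54 fps.
		var flixel = microsPerBunny(52, 27000);
		var openfl = microsPerBunny(54, 240000);
		trace("Flixel+OpenFL: " + flixel + " us per bunny");
		trace("OpenFL: " + openfl + " us per bunny");
		trace("ratio: " + (flixel / openfl));
	}
}
```

With the reported figures this works out to roughly 0.71 µs per bunny for Flixel+OpenFL against roughly 0.077 µs for plain OpenFL, a factor of about 9.2.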

Please forgive me if there is something obvious that I'm missing: both benchmarks are "official", I'm using the default configuration for each project, so I'm guessing they are doing the best they can to show the correct numbers. OpenFL's BunnyMark is using drawQuads, too.

Using the Visual Studio profiler, there seem to be no discernible bottlenecks (as in: there are no obvious inefficiencies in Flixel eating the CPU time); the end result is simply that it takes a lot more time to render.

Is Flixel unwittingly creating more work than it should for the rendering pipeline? Or is OpenFL missing some optimizations that overwhelmingly affect Flixel?

I know that there is more than raw performance to appreciate, but still: it's the same rendering engine, the numbers shouldn't be so different.

How can we approach the issue to gather more data, maybe find the root cause? Does anyone have any pointers?
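One way to gather more data is to time the update and draw phases separately inside the benchmark itself. The following is a minimal sketch using `haxe.Timer.stamp()`; `SectionTimer`, its methods, and the placeholder workloads are hypothetical and not part of Flixel or OpenFL. It only measures CPU time spent inside the callbacks, so time spent on the GPU will not show up here.

```haxe
import haxe.Timer;

class SectionTimer {
	var totals:Map<String, Float>;

	public function new() {
		totals = new Map<String, Float>();
	}

	// Runs `work` and adds its wall-clock duration (in seconds) to the named bucket.
	public function measure(name:String, work:Void->Void):Void {
		var start = Timer.stamp();
		work();
		var elapsed = Timer.stamp() - start;
		totals.set(name, total(name) + elapsed);
	}

	// Accumulated seconds for a bucket, or 0 if it was never measured.
	public function total(name:String):Float {
		var value = totals.get(name);
		return value == null ? 0.0 : value;
	}

	static function main() {
		var timer = new SectionTimer();
		var positions = [for (i in 0...25000) 0.0];

		// Placeholder for the per-bunny position update.
		timer.measure("update", function() {
			for (i in 0...positions.length) positions[i] += 1.5;
		});

		// Placeholder for the code that fills the vectors and issues draw calls.
		timer.measure("draw", function() {
			var sum = 0.0;
			for (p in positions) sum += p;
		});

		trace("update: " + timer.total("update") + " s");
		trace("draw: " + timer.total("draw") + " s");
	}
}
```

Wrapping the same two phases in both benchmarks would show whether the extra time goes into preparing the draw data or into submitting it.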

Dec 06 '23 01:12 giuppe

first thing I notice, the bunnies in the first image are rotated slightly, where they are all upright in the second, this might be preventing batch drawing. ~~Also how are you rendering the openfl standalone test?~~ Edit: Oh, i see. i forgot openfl had a bunnymark demo

Edit 2: The other difference i see is the UI overlay in Flixel, does hiding that improve performance? I doubt it will

Dec 06 '23 16:12 Geokureli

> the bunnies in the first image are rotated slightly

yes, it's the initial angle, I forgot to disable it. It seems to make a small difference (+1000 bunnies):

[image]

> Oh, i see. i forgot openfl had a bunnymark demo

Exactly, I just did `openfl create BunnyMark` and then `openfl test windows`.

Without the UI we gain another 1000 bunnies:

[image]

(I also used the openfl.display.FPS object to avoid using FlxText).

Another 1000 bunnies by disabling the background:

[image]

In any case, even with these changes, the fps counter drops under 60 at 25000 bunnies.

Dec 09 '23 16:12 giuppe

yeah the difference is still massive, and it's worth doing a deep dive into this, thanks for checking those loose ends though!

Is the openfl test using Bitmap instances? Are they both ending up with some gl-batch draw? I think flixel is rendering sprites to a Graphics buffer; I've always wondered if that could be omitted

Dec 09 '23 20:12 Geokureli

OpenFL BunnyMark is doing this (unnecessary code removed):

```haxe
public function new () {
	// ... other initializations ...

	var bitmapData = Assets.getBitmapData ("assets/wabbit_alpha.png");
	tileset = new Tileset (bitmapData);
	tileset.addRect (bitmapData.rect);

	// ...

	indices = new Vector<Int> ();
	transforms = new Vector<Float> ();
}

private function addBunny ():Void {
	var bunny = new Bunny ();
	bunny.x = 0;
	bunny.y = 0;
	bunny.speedX = Math.random () * 5;
	bunny.speedY = (Math.random () * 5) - 2.5;
	bunnies.push (bunny);

	// One tile index and one (x, y) pair per bunny
	indices.push (bunny.id);
	transforms.push (0);
	transforms.push (0);
}

private function stage_onEnterFrame (event:Event):Void {
	for (i in 0...bunnies.length) {
		// ... recalculates bunnies position ...
		transforms[i * 2] = bunny.x;
		transforms[i * 2 + 1] = bunny.y;
	}

	graphics.clear ();
	graphics.beginFill (0xFFFFFF);
	graphics.drawRect (0, 0, stage.stageWidth, stage.stageHeight);
	graphics.beginBitmapFill (tileset.bitmapData, null, false);
	// Single drawQuads call for all bunnies
	graphics.drawQuads (tileset.rectData, indices, transforms);
}
```

So it's doing a single drawQuads call per frame, with all the bunnies. It can do this because all the bunnies have the same bitmapData. I was wondering if this is giving it those insane numbers, and so I put the drawQuads code in the bunnies loop (something more similar to having a different bitmapData for each sprite):

```haxe
private function stage_onEnterFrame (event:Event):Void {
	graphics.clear ();
	graphics.beginFill (0xFFFFFF);
	for (i in 0...bunnies.length) {
		// Fresh vectors per bunny, so each drawQuads call draws one quad
		var transforms = new Vector<Float> ();
		var indices = new Vector<Int> ();

		// ... recalculates bunnies position ...

		transforms.push (bunny.x);
		transforms.push (bunny.y);
		indices.push (0);

		graphics.drawRect (bunny.x, bunny.y, tileset.rectData[2], tileset.rectData[3]);
		graphics.beginBitmapFill (tileset.bitmapData, null, false);
		graphics.drawQuads (tileset.rectData, indices, transforms);
	}
}
```

AFAIK this is closer to how Flixel's FlxCamera.render() works: each item gets its own drawQuads() call. The result is:

[image]

If I'm understanding this correctly (and barring errors in my implementation), this would mean that drawQuads is itself relatively slow and Flixel is already optimizing it a lot...

OR shaderFill is faster than bitmapFill, I'll try.
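The difference between the two loops comes down to how many draw calls are issued per frame. As an illustration only, the sketch below counts the calls a renderer would make if it merged consecutive sprites that share a texture into one call. `BatchCount`, `countBatches`, and the texture keys are hypothetical; this is not Flixel's or OpenFL's actual batching code.

```haxe
class BatchCount {
	// Counts draw calls when consecutive sprites with the same
	// texture key are merged into a single call.
	public static function countBatches(textureKeys:Array<String>):Int {
		var batches = 0;
		var current:Null<String> = null;
		for (key in textureKeys) {
			// A new batch starts at the first sprite and at every texture change.
			if (batches == 0 || key != current) {
				batches++;
				current = key;
			}
		}
		return batches;
	}

	static function main() {
		// All bunnies share one bitmap: everything fits in one call.
		var sameTexture = [for (i in 0...1000) "wabbit"];
		// Every sprite has its own bitmap: one call per sprite.
		var uniqueTextures = [for (i in 0...1000) "sprite" + i];

		trace(countBatches(sameTexture));    // 1
		trace(countBatches(uniqueTextures)); // 1000
	}
}
```

The single-call BunnyMark corresponds to the first case, and the modified loop with one drawQuads per bunny corresponds to the second.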

Dec 10 '23 00:12 giuppe

I changed bitmapFill to shaderFill in OpenFL's BunnyMark and it almost doubles the number of bunnies:

[image]

But that is still far from FlxBunnyMark's 25k bunnies.

Dec 10 '23 15:12 giuppe

Seems related to https://github.com/HaxeFlixel/flixel/issues/3005

Jan 13 '24 16:01 Geokureli