many open files can use lots of memory
If we set max_open_files very high (and of course make sure the OS is set up to support that many open files), then loading in, for instance, 50,000 textures can use up 5.5GB of unaccounted memory even if the texture cache stays small. I believe the reason for this is that each open texture stores data that is not counted against the texture cache limit. For instance, if we look at OpenEXR (note that similar overhead was seen for both TIFF and EXR), we can see from its close() method that closing an EXR releases a bunch of memory:
bool
OpenEXRInput::close ()
{
    delete m_input_multipart;
    delete m_scanline_input_part;
    delete m_tiled_input_part;
    delete m_deep_scanline_input_part;
    delete m_deep_tiled_input_part;
    delete m_input_scanline;
    delete m_input_tiled;
    delete m_input_stream;
    init ();   // Reset to initial state
    return true;
}
I don't know whether this is the source of the majority of the overhead, or whether the overhead comes from some other bookkeeping in OIIO, but it at least indicates that unaccounted overhead does exist in OIIO.
I'm not sure what the proper way to handle this is. One option is to factor this overhead into the texture cache. The tricky part would be figuring out when it's better to close a file handle and when it's better to evict cache data (maybe you'd want to close the files that have the least amount of data in the cache, or the files that have gone unused the longest?). Another option is to let OIIO continue to use the extra memory, but at least have it tabulate this overhead and include it in the logs it produces, so that users can see what is causing the 5.5GB of unaccounted memory.
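As a concrete illustration of that second option, here is a minimal sketch (the struct, its methods, and the stats line are all hypothetical, not part of OIIO's actual API) of tabulating an estimated per-open-file overhead and surfacing the total in the statistics output:

#include <cstddef>
#include <cstdio>
#include <map>
#include <string>

// Hypothetical bookkeeping: an estimated "hidden" cost for each open
// file handle, kept separate from the tile cache accounting.
struct OpenFileOverheadStats {
    std::map<std::string, size_t> per_file;   // filename -> estimated bytes
    size_t total = 0;

    void on_open (const std::string &filename, size_t estimated_bytes) {
        per_file[filename] = estimated_bytes;
        total += estimated_bytes;
    }
    void on_close (const std::string &filename) {
        auto it = per_file.find (filename);
        if (it != per_file.end()) {
            total -= it->second;
            per_file.erase (it);
        }
    }
    // A line like this could be appended to the existing cache statistics
    // so the otherwise-unaccounted memory at least shows up in the logs.
    void report () const {
        std::printf ("  estimated open-file overhead: %.1f MB across %zu files\n",
                     total / (1024.0 * 1024.0), per_file.size());
    }
};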
I would go even further and say that most of the unaccounted memory is probably data held internally by the underlying format libraries (libIlmImf, libtiff, etc.), hidden behind opaque pointers from OIIO's point of view.
So problem 1 is: how can OIIO track this? I'm betting that (unfortunately) those libraries don't have an easy way to query how much internal data is being used by each open TIFF or OpenEXR file, say. Perhaps we can come up with an estimate, even if not an exact figure? I assume the "open file memory overhead" differs for each format, perhaps with each version of the underlying library, and possibly varies per file as well (do files with big headers full of metadata take more RAM within libIlmImf? Is there a one-tile or one-scanline buffer kept internally? Various compression tables or other data constructed on the fly?).
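To make that concrete, a rough per-format estimate might look something like the sketch below. The function name and all of the constants are placeholders for illustration, not measured numbers, and would have to be tuned per format (and probably per library version):

#include <cstddef>
#include <string>

// Illustrative only: guess how much memory the underlying format library
// might hold for one open file. The base numbers are made up; a real
// implementation would measure them, and refine the per-file part from
// header/metadata size and any scanline/tile buffer kept internally.
size_t
estimate_open_file_overhead (const std::string &format,
                             size_t metadata_bytes,
                             size_t scanline_or_tile_bytes)
{
    size_t base = 64 * 1024;          // default guess for unknown formats
    if (format == "openexr")
        base = 512 * 1024;            // placeholder: libIlmImf per-file state
    else if (format == "tiff")
        base = 256 * 1024;            // placeholder: libtiff per-file state
    return base + metadata_bytes + scanline_or_tile_bytes;
}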
Problem 2 is: if we could track the memory (or a decent estimate of it), how do we manage the cache requirements? Should we have a single "total memory budget," and if so, how do we adjudicate the tradeoff between reclaiming tiles and closing open files for memory's sake? That is, when we've exceeded the budget, how do we know whether it's smarter to reclaim a tile or to close an open file? I can imagine a bad choice there having horrible performance consequences. Or do we have separate "tile cache memory limit" and "open file memory limit" options? The file memory seems much closer in spirit to our existing policing of the maximum number of open OS file handles -- maybe those two can sort of be combined (i.e., close an LRU file if we're out of OS handles or out of file memory budget, and treat that as a separate resource from the tile cache memory for pixels).
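In the spirit of that last idea, here is a hypothetical sketch (names and structure invented for illustration, not existing OIIO code) of policing open files by both handle count and estimated open-file memory with one LRU list, as a resource separate from the pixel tile cache:

#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical combined policy: open file handles and their estimated
// hidden memory share one LRU list and one eviction rule, independent
// of the tile cache limit for pixel data.
struct OpenFilePolicer {
    size_t max_open_files;          // handle-count limit, as exists today
    size_t max_open_file_bytes;     // new limit on estimated hidden memory
    size_t open_file_bytes = 0;
    // front = most recently used; each entry is (filename, estimated bytes)
    std::list<std::pair<std::string, size_t>> lru;

    void note_open (const std::string &filename, size_t estimated_bytes) {
        // For brevity, assumes the file isn't already in the list.
        lru.emplace_front (filename, estimated_bytes);
        open_file_bytes += estimated_bytes;
    }

    // Close least-recently-used files until both limits are satisfied.
    std::vector<std::string> files_to_close () {
        std::vector<std::string> victims;
        while (!lru.empty()
               && (lru.size() > max_open_files
                   || open_file_bytes > max_open_file_bytes)) {
            victims.push_back (lru.back().first);
            open_file_bytes -= lru.back().second;
            lru.pop_back();
        }
        return victims;
    }
};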
I'm open to suggestions. Thus far, we haven't really considered this issue, although when spelled out it's easy to imagine how it can become significant once there are thousands of textures in flight.