Re-added the string cache, disabled by default (bonus: also thread-safe)
I added a string cache to DAST in August last year, which kept down memory fragmentation when it was used repeatedly. It was never pulled into the main branch and that branch is now very hard to merge. I've re-implemented it.
It's controlled by a define, which is off by default, so behaviour is unchanged unless you explicitly turn it on. If you turn it on, attributes and valued syntax nodes share strings. This greatly reduces memory fragmentation and also memory usage. The actual changed code is quite small so it should be easy to maintain or keep in the product.
I investigated using injection or some similar "clean code" behaviour to control this, instead of a define, but it's hard when also keeping it performant. Since not much code is affected, ifdef-ing the required places is cheap and hidden from the external user.
The string cache also has basic thread safety added, where the Add and Get operations hold a critical section while modifying the values. This too is controlled by a define, but it's on by default. It hasn't been extensively reviewed, but I couldn't see any major logic errors.
Thanks, @vintagedave Let's wait a couple of days, if everyone is happy, I'll merge.
Btw, I'm asking because I've never tried. You use class as a dictionary key. Does it work? I afraid it could use pointer, not a content.
Great :)
I also have another branch (not done yet) for using a memory pool for all syntax nodes, using this technique: https://parnassus.co/custom-object-memory-allocation-in-delphi-bypassing-fastmm-for-fun-and-profit/ Is that of interest too?
Re dictionary - is that TStringCache.FStringToId? It would use the pointer, but I construct the dictionary with a TStringRecValueEqualityComparer instance which is use for the comparisons. It compares by the string value in each TStringRec only.
Ah, OK. I see :)
Do you think memory fragmentation is an issue?
For me it is, but that only occurs when using DAST a lot, and under memory pressure, eg in the pre-XE8 IDE. It might be for you for FixInsight over a large project, for example. I think FI still runs DAST a lot less than Navigator or Bookmarks do, which basically re-parse after a short delay after every single code change.
Can't this be solved way easier?
type
TStringInternPool = class
private
fStrings: TDictionary<string,string>;
public
constructor Create;
destructor Destroy; override;
function Intern(const s: string): string;
end;
{ TStringInternPool }
constructor TStringInternPool.Create;
begin
fStrings := TDictionary<string,string>.Create;
end;
destructor TStringInternPool.Destroy;
begin
fStrings.Free;
inherited;
end;
function TStringInternPool.Intern(const s: string): string;
begin
MonitorEnter(fStrings);
try
if not fStrings.TryGetValue(s, Result) then
begin
Result := s;
fStrings.Add(s, Result);
end;
finally
MonitorExit(fStrings);
end;
end;
Yes, though the current code allows usage tracking - you can get stats out about how effective it is.
On 28 April 2016 at 18:44, Stefan Glienke [email protected] wrote:
Can't this be solved way easier?
type TStringInternPool = class private fStrings: TDictionary<string,string>; public constructor Create; destructor Destroy; override; function Intern(const s: string): string; end;
{ TStringInternPool }
constructor TStringInternPool.Create; begin fStrings := TDictionary<string,string>.Create; end;
destructor TStringInternPool.Destroy; begin fStrings.Free; inherited; end;
function TStringInternPool.Intern(const s: string): string; begin MonitorEnter(fStrings); try if not fStrings.TryGetValue(s, Result) then begin Result := s; fStrings.Add(s, Result); end; finally MonitorExit(fStrings); end; end;
— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/RomanYankovsky/DelphiAST/pull/172#issuecomment-215472223
Stefan's approach seems to be less intrusive for me. I think it can be improved by adding a method that will output statistics. We can read string's ref counter for usage tracking. What do you think @vintagedave ?
Yes, I think it's much better. @sglienke told me he submitted a patch where strings are interned in the lexer itself, which means it's much lower-level too. I'd go with that option as much better than mine.
I don't see his pull request :)
Look at the feature/stringinterning branch of my fork to see my approach - I did not submit a PR because this is just to show the concept.
So everyone agree to take Stefan's implementation of string cache?
For me, yes.
My string pool implementation was taken into master long ago - this can be closed