As discussed live yesterday and after having introduced PatternInformation next
to PatternCollectionInformation (issue892), we would like to keep patterns and
their corresponding PDBs together. This will allow making changes to a pattern
collection, such as sorting or removing elements, while keeping the list of PDBs
valid, which is also important to handle issue785 (consistent normalization of
pattern collections) easily.
The current idea is as follows: PatternCollectionInformation stores a list of
PatternInformation objects, each of which wraps a pattern together with its PDB.
(To keep this wrapper light-weight, we should get rid of TaskProxy in
PatternInformation and let callers of "get_pdb" pass the task proxy in.) Instead
of using shared_ptr both for the pattern collection (list of patterns) and for
the individual PDBs inside a vector, we should consider storing the plain data
types in the information objects. This way, they act as an interface for transferring
the patterns (and possibly PDBs and max. additive subsets) from the generator to
the user. The user would then extract the things it needs to store, thus taking
full ownership of patterns and PDBs.
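To make the idea concrete, here is a minimal sketch of what storing plain data
types could look like. All names (PatternInfo, PatternCollectionInfo, PDB,
extract, sort_by_pattern) are hypothetical stand-ins, not the existing classes,
and the sketch assumes the PDB is always built eagerly, which is not what
PatternInformation does today.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins; the real Fast Downward classes differ.
using Pattern = std::vector<int>;

struct PDB {
    Pattern pattern;             // copy the PDB needs for its hash function
    std::vector<int> distances;  // placeholder for the distance table
};

// Light-weight pair of a pattern and its PDB, both stored by value.
struct PatternInfo {
    Pattern pattern;
    PDB pdb;
};

class PatternCollectionInfo {
    std::vector<PatternInfo> entries;
public:
    // Assumption: the PDB is computed right away (dummy table here).
    void add(Pattern pattern) {
        PDB pdb{pattern, std::vector<int>(pattern.size(), 0)};
        entries.push_back(PatternInfo{std::move(pattern), std::move(pdb)});
    }

    std::size_t size() const { return entries.size(); }

    const PatternInfo &get(std::size_t index) const {
        assert(index < entries.size());
        return entries[index];
    }

    // Sorting moves whole entries, so each PDB stays with its pattern.
    void sort_by_pattern() {
        std::sort(entries.begin(), entries.end(),
                  [](const PatternInfo &a, const PatternInfo &b) {
                      return a.pattern < b.pattern;
                  });
    }

    // Hands all entries to the caller, who takes full ownership.
    std::vector<PatternInfo> extract() {
        std::vector<PatternInfo> result = std::move(entries);
        entries.clear();
        return result;
    }
};
```

With something like this, sorting or removing elements cannot invalidate the
list of PDBs, because there is only one list.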
This would still leave us storing each pattern twice: once as the pattern of
interest (we might not have a PDB for it yet), and once within the PDB, which
needs it to compute the hash function. I don't see an easy way around this except
for sharing the pattern of a PDB via shared_ptr. I don't know if that is a good
idea. What do you think?
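For reference, a rough sketch of the shared_ptr variant, so we can judge it on
something concrete. Again, all names (SharedPatternPDB, SharedPatternInfo,
make_info, compute_pdb) are made up for illustration, and the distance table is
a dummy.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

using Pattern = std::vector<int>;

// Hypothetical PDB that shares its pattern instead of copying it.
struct SharedPatternPDB {
    std::shared_ptr<const Pattern> pattern;  // used by the hash function
    std::vector<int> distances;              // placeholder distance table
};

// Hypothetical wrapper: the pattern always exists, the PDB only on demand.
struct SharedPatternInfo {
    std::shared_ptr<const Pattern> pattern;
    std::unique_ptr<SharedPatternPDB> pdb;   // null until computed
};

inline SharedPatternInfo make_info(Pattern pattern) {
    return SharedPatternInfo{
        std::make_shared<const Pattern>(std::move(pattern)), nullptr};
}

// Builds the PDB lazily; it refers to the same Pattern object as the wrapper.
inline void compute_pdb(SharedPatternInfo &info) {
    assert(info.pattern);
    if (!info.pdb) {
        info.pdb = std::make_unique<SharedPatternPDB>(SharedPatternPDB{
            info.pattern, std::vector<int>(info.pattern->size(), 0)});
    }
}
```

This stores the pattern only once, at the cost of a reference-counted
indirection and of the user no longer owning the pattern exclusively.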