Nosying Jendrik here because this affects the scripts, and everyone else because
I know all of us have stumbled over this at some point. As always, unnosy
yourselves at will.
We need some way of computing proper IPC-2008-style quality scores, i.e., scores
relative to some reference result. This could, but doesn't have to, go hand in
hand with developing a proper benchmark repository that people other than us
could use.
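
To make the scoring concrete, here is a minimal sketch of the IPC-2008 quality metric: per task, the score is the reference cost divided by the cost of the plan found, and 0 if the task was not solved. The function names, the dict-based inputs and the handling of zero-cost plans are assumptions for illustration, not anything the scripts currently provide.

```python
def quality_score(plan_cost, reference_cost):
    """IPC-2008-style score for one task: reference_cost / plan_cost.

    plan_cost is None if the task was not solved, which scores 0.
    """
    if plan_cost is None:
        return 0.0
    if plan_cost == 0:
        # Assumption: a zero-cost plan cannot be improved, so it scores 1.
        return 1.0
    return reference_cost / plan_cost


def total_score(plan_costs, reference_costs):
    """Sum the per-task scores over all tasks that have a reference cost.

    Both arguments map task names to costs; tasks missing from
    plan_costs count as unsolved.
    """
    return sum(
        quality_score(plan_costs.get(task), reference_cost)
        for task, reference_cost in reference_costs.items()
    )
```

A score above 1 for some task means the plan beats the stored reference, which is a hint that the reference should be updated.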
Maybe we can discuss a first cut for this during the next meeting of Jendrik,
Gabi and me.
A good first step would be a simple database (in the widest sense, e.g. it could
be ConfigObj-based) with some way of centrally accessing and updating it (e.g.
via a repository). When updating, an actual plan should be stored and validated.
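
As a rough sketch of what such a database could look like, the following keeps one entry per task with the best known cost and the plan that achieves it. It uses a JSON file instead of ConfigObj only to stay within the standard library. The file layout, the function names and the `validate` callback are all hypothetical; in practice the callback would wrap an actual plan validator.

```python
import json
from pathlib import Path


def load_db(path):
    """Load the reference database, or return an empty one if the file is missing."""
    path = Path(path)
    if not path.exists():
        return {}
    with path.open() as f:
        return json.load(f)


def save_db(db, path):
    """Write the database in a stable, diff-friendly form for version control."""
    with Path(path).open("w") as f:
        json.dump(db, f, indent=2, sort_keys=True)


def update_reference(db, task, plan, cost, validate):
    """Store plan as the new reference for task if it is valid and cheaper.

    validate is a hypothetical callback taking (task, plan, cost) and
    returning True if the plan solves the task at the claimed cost.
    Returns True if the database was changed.
    """
    if not validate(task, plan, cost):
        raise ValueError(f"invalid plan for {task}")
    entry = db.get(task)
    if entry is not None and entry["cost"] <= cost:
        # The stored reference is at least as good; keep it.
        return False
    db[task] = {"cost": cost, "plan": list(plan)}
    return True
```

Keeping the file sorted and indented means that updates show up as small diffs if the database lives in a repository.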