New Apport feature: custom bug duplicate identification

Apport has provided built-in support for automatically identifying and marking duplicate bug reports for normal signal as well as Python crashes. However, we have more kinds of bug reports submitted through Apport which could benefit from automatic duplication: X.org GPU freezes, package installation failures, kernel oopses, or gcc internal compiler errors, i. e. pretty much everything that gets reported automatically these days.

The latest Apport 1.20 (which also just hit current Ubuntu Natty) now allows package hooks to set a special field DuplicateSignature, which abstracts the concept for other kinds of bug reports where Apport doesn’t do automatic duplication. This field should both uniquely identify the problem class (e. g. “XorgGPUFreeze”) as well as the particular problem, i. e. variables which tell this instance apart from different problems. Aside from these requirements, the value can be any free-form string, Apport only treats it as an opaque value. It doesn’t even need to be ASCII only or only be one line, but for better human inspection I recommend this.

So your report could do something like

report['DuplicateSignature'] = 'XorgGPUFreeze: instruction %s regs:%s:%s:%s' % (
                     current_instruction, regs[0], regs[1], regs[2])

or

report['DuplicateSignature'] = 'PackageFailure: ' + log.splitlines()[-1]

This is integrated into Apport’s already existing CrashDatabase class, which maintains a signature →master bug mapping in a SQLite database. So far these contained the crash signatures (built from executable name, signal number, and the topmost 5 stack trace names). As usual, if an incoming report defines a duplicate signature (from the crash stack trace or from DuplicateSignature), the first one will become the master bug, and all subsequent reports will automatically get closed as a duplicate in Launchpad.

Thanks to Bryce Harrington, who already came up with using this in the latest Intel X.org graphics driver for GPU hangs!