Hachoir 1.0

Classes

  • BasicFieldSet
    • SeekableFieldSet
      • RootSeekableFieldSet
    • GenericFieldSet
      • FieldSet

Problems

  • Too many classes
  • Code duplication
  • Code is very complex
  • Code is not generic
  • Some bugs in SeekableFieldSet classes
  • It's not possible to use a FieldSet or RootSeekableFieldSet as a subfield

Problematic parsers

  • EXE RES, EXIF, TIFF: fields are stored in arbitrary order
  • OLE2, File system: fields can be anywhere on the disk

Hachoir 2.0

Write new field set classes.

Methods and attributes

  • Store 1 or more children
  • Field order:
    • by address? (current FieldSet)
    • by logical order? (current SeekableFieldSet)
  • Have a size?
  • Have an address?
  • fieldset["name"]: Get a field by its name
  • fieldset[0]: Get a field by its index (?)
  • for item in fieldset: Iterate over fields

Utilities:

  • validate() method?
  • description attribute
  • value attribute
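
The access patterns listed above could be sketched roughly as follows. This is only an illustration of the idea, not the Hachoir API: the names Field, SketchFieldSet, add() and the order parameter are assumptions made for this example.

```python
class Field:
    """Hypothetical leaf field: name, address and size (in bits)."""

    def __init__(self, name, address, size, description="", value=None):
        self.name = name
        self.address = address
        self.size = size
        self.description = description
        self.value = value


class SketchFieldSet:
    """Hypothetical container (not the Hachoir API).

    order="logical" keeps insertion order; order="address" sorts by address.
    """

    def __init__(self, order="logical"):
        if order not in ("logical", "address"):
            raise ValueError("order must be 'logical' or 'address'")
        self._order = order
        self._fields = []    # fields in insertion (logical) order
        self._by_name = {}   # name -> field, for fieldset["name"]

    def add(self, field):
        if field.name in self._by_name:
            raise KeyError("duplicate field name: %r" % field.name)
        self._fields.append(field)
        self._by_name[field.name] = field

    def _ordered(self):
        if self._order == "address":
            return sorted(self._fields, key=lambda f: f.address)
        return list(self._fields)

    def __getitem__(self, key):
        # fieldset["name"] looks up by name, fieldset[0] by index
        if isinstance(key, str):
            return self._by_name[key]
        return self._ordered()[key]

    def __iter__(self):
        return iter(self._ordered())

    def __len__(self):
        return len(self._fields)
```

Making the order a parameter is one way to let a single class cover both the address-ordered and the logically ordered cases. Whether index access is worth keeping is still the open question marked (?) above.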

File system

  • Fields can be anywhere
  • No address order => use logical order
  • Field set has no size or its size is the size of the stream
    • Problem: Hachoir needs size+address attributes
  • Multiple indirections (3 levels in EXT2, 2 levels in FAT, 3 levels in FAT bigger than 6 MB)

Another idea

  • Union, like a C union: have two (or more) different "views" of the same data
    • Same address
    • Maybe same size
    • Different parsing
  • This should also be possible with SeekableFieldSet, since it doesn't check whether two fields are stored at the same address
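
A minimal sketch of the union idea, assuming each "view" is just a parsing function applied to the same address. UnionView and the two view functions are hypothetical names used for illustration only.

```python
import struct


def view_as_uint32(data, address):
    """Parse 4 bytes at `address` as one little-endian unsigned integer."""
    return struct.unpack_from("<I", data, address)[0]


def view_as_bytes4(data, address):
    """Parse the same 4 bytes as a tuple of individual byte values."""
    return tuple(data[address:address + 4])


class UnionView:
    """Hypothetical union: several parsers sharing one address."""

    def __init__(self, data, address, views):
        self.data = data
        self.address = address
        self.views = dict(views)  # view name -> parsing function

    def __getitem__(self, name):
        return self.views[name](self.data, self.address)


data = bytes([0x01, 0x02, 0x03, 0x04])
union = UnionView(data, 0, {"uint32": view_as_uint32, "bytes": view_as_bytes4})
print(hex(union["uint32"]))  # 0x4030201
print(union["bytes"])        # (1, 2, 3, 4)
```

Both views here have the same address and the same size. Only the parsing differs.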

Compatibility with Hachoir 1.0

  • Reimplement the SeekableFieldSet, RootSeekableFieldSet, GenericFieldSet and FieldSet classes on top of the new API, while still exposing the old API
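
One possible shape for such a compatibility layer, assuming the new class takes the field order as a parameter: the old class names become thin subclasses that only fix that choice. NewFieldSet, add() and names() are hypothetical and much simpler than the real classes. The sketch covers only two of the four old names.

```python
class NewFieldSet:
    """Hypothetical new-style container with a configurable field order."""

    def __init__(self, order):
        self.order = order
        self._fields = []  # (name, address) pairs in insertion order

    def add(self, name, address):
        self._fields.append((name, address))

    def names(self):
        fields = self._fields
        if self.order == "address":
            fields = sorted(fields, key=lambda item: item[1])
        return [name for name, _ in fields]


class FieldSet(NewFieldSet):
    """Old name kept for compatibility: fields ordered by address."""

    def __init__(self):
        super().__init__(order="address")


class SeekableFieldSet(NewFieldSet):
    """Old name kept for compatibility: fields kept in logical order."""

    def __init__(self):
        super().__init__(order="logical")
```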