Next releases

Bugfix commit:

News between 0.7 and 0.8

hachoir-core

  • New type: Float80 (80-bit floating-point number), needed by the AIFF sound format
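
To illustrate what an 80-bit floating-point field holds, here is a minimal sketch of decoding one in plain Python. It is not hachoir's Float80 implementation: the function name decode_float80_be is hypothetical, and the layout assumed is the usual extended-precision one (1 sign bit, 15 exponent bits with bias 16383, 64-bit mantissa with an explicit integer bit), stored big-endian as in AIFF.

```python
import math
import struct


def decode_float80_be(data):
    """Decode a big-endian 80-bit extended-precision float into a Python float.

    Hypothetical helper for illustration only; not part of hachoir.
    """
    if len(data) != 10:
        raise ValueError("expected exactly 10 bytes, got %d" % len(data))
    # First 16 bits: sign (1 bit) + biased exponent (15 bits).
    # Last 64 bits: mantissa, including the explicit integer bit.
    sign_exponent, mantissa = struct.unpack(">HQ", data)
    sign = -1.0 if sign_exponent & 0x8000 else 1.0
    exponent = sign_exponent & 0x7FFF
    if exponent == 0x7FFF:
        # All-ones exponent: infinity if the fraction bits are zero, else NaN.
        fraction = mantissa & 0x7FFFFFFFFFFFFFFF
        return sign * math.inf if fraction == 0 else math.nan
    if exponent == 0:
        # Zero and denormal numbers use the minimum exponent.
        exponent = 1
    # value = mantissa * 2**(exponent - 16383 - 63).
    # Values beyond the range of a 64-bit float raise OverflowError here.
    return sign * math.ldexp(mantissa, exponent - 16383 - 63)


# A sample rate of 44100 Hz encoded as an 80-bit float:
sample_rate = decode_float80_be(bytes.fromhex("400eac44000000000000"))
print(sample_rate)  # 44100.0
```

The result is only as precise as a 64-bit float, which is enough for values such as sample rates.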

hachoir-parser

  • New parsers: Audio Interchange File Format (AIFF), MIDI audio, Linux swap file, WMF picture, Real audio (.ra), Truevision Targa Graphic (.tga), Real Media (.rm)

hachoir-metadata

  • New extractors: Audio Interchange File Format (AIFF), MIDI audio, WMF picture, Real audio (.ra), Truevision Targa Graphic (.tga), Real Media (.rm)

News between 0.6 and 0.7

Key changes

  • Support decompression (gzip, rpm, swf, bzip2)
  • New program: hachoir-wx, a GUI based on wxWidgets that works on Windows, Linux and Mac OS X

hachoir-core

  • Editor: support Float32, Float64, Character
  • Parser: no longer has mime_type or tags attributes
  • GenericString: fix UnixLine and remove ISO-8859-12 charset (doesn't exist)
  • SubFile: add optional argument parser
  • Float32/64 are now field sets, so it's possible to get the sign, exponent and mantissa
  • Rename this component from "hachoir" to "hachoir-core"
  • New field types: SubFile, EncodedFile, GenericVector, UserVector, Int24, UInt24
  • Field API:
    • Rename getOriginalDisplay() to raw_display, a property that uses createRawDisplay()
    • Rename _createValue() to createValue()
    • Rename _createDescription() to createDescription()
  • GenericFieldSet:
    • Rename getExistingFieldByAddress() to getFieldByAddress() and add an optional feed argument
    • Use cache for array() method
    • Fix _fixFieldSize() method for field sets with an invalid size (zero or negative)
  • StringInputStream: add optional source argument
  • InputSubStream: use _offset attribute to store offset, instead of address
  • Move export_xml.py to hachoir-console
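
The Float32/64 item above says the sign, exponent and mantissa are now accessible. As a rough sketch of what those three parts are for a 32-bit float, independent of hachoir's field classes, the standard struct module can split the raw bits. The function name split_float32 is hypothetical.

```python
import struct


def split_float32(value):
    """Return (sign, biased exponent, mantissa) of value as an IEEE 754 float32.

    Hypothetical helper for illustration only; not part of hachoir.
    """
    # Pack as a 32-bit float, then reinterpret the same 4 bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, bias 127
    mantissa = bits & 0x7FFFFF       # 23 bits, implicit leading 1 not stored
    return sign, exponent, mantissa


print(split_float32(1.0))   # (0, 127, 0)
print(split_float32(-2.5))  # (1, 128, 2097152)
```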

hachoir-parser

  • New parsers: SWF (Flash), FLV (Flash video), JAVA class, Ogg/Vorbis, Ogg/Theora
  • MPEG audio: rewrite getFrameSize(), which now works on all MPEG versions (uses code from the ffmpeg project); better file validation (fewer false positives)
  • image.common: Palette is now PaletteRGB and is based on UserVector class
  • JPEG: Fix JpegChunk for Start Of Image and End Of Image chunks (which have neither content nor size); better file validation
  • ELF: Set minimum size of 36 bytes instead of 4
  • Create function parseStream()
  • New Parser class based on the simple Parser class from hachoir-core
  • Split run_testcase.py into three: download_testcase.py, run_testcase.py for hachoir-parser and run_testcase.py for hachoir-metadata
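
For the MPEG audio item above, the frame size depends on the layer and the MPEG version. The sketch below uses the commonly documented formulas. It is neither hachoir's getFrameSize() nor ffmpeg's code, and the function name and parameters are hypothetical.

```python
def mpeg_audio_frame_size(layer, mpeg_version, bit_rate, sample_rate, padding=False):
    """Return the size in bytes of one MPEG audio frame.

    Hypothetical helper for illustration only.
    layer: 1, 2 or 3; mpeg_version: 1, 2 or 2.5;
    bit_rate in bit/s; sample_rate in Hz.
    """
    if layer == 1:
        # Layer I frames are counted in 4-byte slots.
        return (12 * bit_rate // sample_rate + int(padding)) * 4
    if layer == 3 and mpeg_version != 1:
        # MPEG-2 and MPEG-2.5 Layer III frames hold half as many samples.
        coefficient = 72
    else:
        coefficient = 144
    return coefficient * bit_rate // sample_rate + int(padding)


# MPEG-1 Layer III, 128 kbit/s, 44.1 kHz
print(mpeg_audio_frame_size(3, 1, 128000, 44100))        # 417
print(mpeg_audio_frame_size(3, 1, 128000, 44100, True))  # 418
```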

hachoir-metadata

  • New extractors: SWF (Flash), FLV (Flash video), Ogg/Vorbis, Ogg/Theora
  • Add "bits/sample", used by AU, WAV and MPEG audio extractors
  • Add "language" key, used by Matroska extractor
  • Add "aspect_ratio", used by Ogg/Theora
  • Matroska can also extract audio comment and codec

hachoir-urwid

  • Can use the new SubFile information when parsing a sub-stream (filename, mime type, parser)

News between 0.5 and 0.6

List of user-visible changes:

  • Hachoir is now able to edit a file (see wiki:hachoir-editor)
  • Scripts:
    • hachoir-urwid: fix switching between the human and real values of integers and strings, new option --force-mime
    • hachoir-strip: new script to remove producer information, timestamps, metadata, useless padding, etc.
  • Parsers:
    • New parsers: Abstract Syntax Notation One (ASN.1), basic MPEG video parser, Tcpdump file (Ethernet, IPv4, ARP, ICMP, TCP, TCP options, UDP), ZSNES save (by Jason Gorski), 3DO model (by Cyril Zorin), Spider-Man video (by Mike Melanson)
    • Rewrite autofix feature: Hachoir can now fix most parser errors when at least one parent size is known
    • MPEG audio: support padding between frames, better file validation, guess whether the bit rate is constant (CBR) or variable (VBR)
    • Python PYC: rewritten from scratch, now supports Python 1.5 to 2.5
  • I18n of Hachoir:
    • Most strings are now Unicode strings rather than byte strings
    • Use gettext (using "_" alias) and ngettext (singular/plural form) to translate text
    • Hachoir scripts, the urwid interface, the metadata extractor and most kernel functions are translated into French
  • Add support for piped input. In other words:
    • Data is cached to allow backward seeking, and cached data is discarded automatically
    • The core tries to do as much as possible without knowing the size of the stream
  • Other:
    • Improve "external links": (in urwid) Remove 'f' key, 'space' is enough.
    • Hachoir runs on IronPython (v1.0), PyPy, Python 2.2 to 2.5, Stackless 2.4 to 2.5, and Jython 2.2 (see wiki:Compatibility for more details)

List of developer-visible changes:

  • GenericString: support UTF-16 and UTF-32 with BOM, fix string length in characters
  • New field types: NullBits, NullBytes, SubFile
  • Create GenericFieldSet.array() method: self["name[%u]" % index] <=> self.array("name")[index]
  • Benchmark:
    • Write Benchmark class which automatically computes the number of calls
    • Write bench.sh to run all benchmarks
  • Remove an old and useless dependency: python-xml
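
The equivalence stated above for GenericFieldSet.array() can be sketched with toy classes. ToyFieldSet and ToyArray are hypothetical stand-ins written for this example. They are not hachoir classes and ignore caching, parsing and everything else the real field set does.

```python
class ToyArray:
    """View over the fields named "name[0]", "name[1]", ... in a field set."""

    def __init__(self, field_set, name):
        self._field_set = field_set
        self._name = name

    def __getitem__(self, index):
        # array("name")[index] is looked up as "name[index]".
        return self._field_set["%s[%u]" % (self._name, index)]


class ToyFieldSet:
    """Hypothetical stand-in for a field set: a mapping from names to values."""

    def __init__(self, fields):
        self._fields = dict(fields)

    def __getitem__(self, name):
        return self._fields[name]

    def array(self, name):
        return ToyArray(self, name)


field_set = ToyFieldSet({"header": "hdr", "chunk[0]": "first", "chunk[1]": "second"})
index = 1
print(field_set["chunk[%u]" % index] == field_set.array("chunk")[index])  # True
```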

See also