hachoir-metadata is a tool to extract metadata from multimedia files (sound, video, archives, etc.): see output examples!
Features
- Gtk interface
- Plugins for Nautilus (Gnome) and Konqueror (KDE)
- Supports invalid / truncated files
- Unicode compliant (charsets ISO-8859-XX, UTF-8, UTF-16); converts strings to your terminal charset
- Removes duplicate values (and if a string is a substring of another, keeps only the longest one)
- Assigns a priority to each value, so metadata can be filtered (option --level)
- Only depends on hachoir-parser (and not on libmatroska, libmpeg2, libvorbis, etc.)
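The duplicate-removal rule listed above can be illustrated with a short Python sketch. This is not the code used by hachoir-metadata; the function name `dedupe` is hypothetical, and the sketch only shows the stated rule: drop exact duplicates, and when one string is contained in another, keep the longer one.

```python
def dedupe(strings):
    """Illustrative only: keep each value once, preferring the longest
    string whenever one value is a substring of another."""
    kept = []
    for text in strings:
        # Skip exact duplicates and strings already contained in a kept value.
        if any(text in other for other in kept):
            continue
        # Drop previously kept values that the new, longer string contains.
        kept = [other for other in kept if other not in text]
        kept.append(text)
    return kept


print(dedupe(["Kubuntu", "Kubuntu logo", "Kubuntu", "GIMP"]))
```

Note that in this sketch a longer string takes the position where it was found, so the result order may differ from the order of first appearance.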
Download
- Download and install hachoir-metadata
- Browse source code online
- Hachoir-metadata on Python Cheeseshop
Code example
See hachoir-metadata code example.
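As a rough illustration, here is a minimal Python sketch of extracting metadata from a script. It assumes the current `hachoir` package for Python 3 is installed, which exposes `hachoir.parser.createParser` and `hachoir.metadata.extractMetadata`; the older releases described on this page shipped as separate packages with different module names. The function name `print_metadata` is hypothetical.

```python
def print_metadata(filename):
    """Print the metadata of one file, or a short message on failure."""
    # Imported lazily so the sketch can be loaded without hachoir installed.
    from hachoir.parser import createParser
    from hachoir.metadata import extractMetadata

    parser = createParser(filename)
    if not parser:
        print("Unable to parse file: %s" % filename)
        return None
    with parser:
        metadata = extractMetadata(parser)
    if not metadata:
        print("Unable to extract metadata: %s" % filename)
        return None
    # exportPlaintext() may return None when there is nothing to show.
    lines = metadata.exportPlaintext() or []
    for line in lines:
        print(line)
    return lines
```

Calling `print_metadata("logo-Kubuntu.png")` should print a listing similar to the command-line output shown in the Options section.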
Supported file formats
Total: 33 file formats.
Archive
- bzip2: bzip2 archive
- cab: Microsoft Cabinet archive
- gzip: gzip archive
- mar: Microsoft Archive
- tar: TAR archive
- zip: ZIP archive
Audio
- aiff: Audio Interchange File Format (AIFF)
- mpeg_audio: MPEG audio version 1, 2, 2.5
- real_audio: Real audio (.ra)
- sun_next_snd: Sun/NeXT audio
Container
- matroska: Matroska multimedia container
- ogg: Ogg multimedia container
- real_media: RealMedia (rm) Container File
- riff: Microsoft RIFF container
Image
- bmp: Microsoft bitmap (BMP) picture
- gif: GIF picture
- ico: Microsoft Windows icon or cursor
- jpeg: JPEG picture
- pcx: PC Paintbrush (PCX) picture
- png: Portable Network Graphics (PNG) picture
- psd: Photoshop (PSD) picture
- targa: Truevision Targa Graphic (TGA)
- tiff: TIFF picture
- wmf: Microsoft Windows Metafile (WMF)
- xcf: Gimp (XCF) picture
Misc
- ole2: Microsoft Office document
- pcf: X11 Portable Compiled Font (pcf)
- torrent: Torrent metainfo file
- ttf: TrueType font
Program
- exe: Microsoft Windows Portable Executable
Video
- asf: Advanced Streaming Format (ASF), used for WMV (video) and WMA (audio)
- flv: Macromedia Flash video
- mov: Apple QuickTime movie
Options
Modes --mime and --type
Option --mime asks hachoir-metadata to display only the MIME type of each file:
    $ hachoir-metadata --mime logo-Kubuntu.png sheep_on_drugs.mp3 wormux_32x32_16c.ico
    logo-Kubuntu.png: image/png
    sheep_on_drugs.mp3: audio/mpeg
    wormux_32x32_16c.ico: image/x-ico
(it works like the UNIX "file --mime" command)
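Since the --mime output has one "file name: MIME type" line per file, it is easy to consume from a script. The following Python sketch assumes exactly that line format, as in the example above; the helper names `parse_mime_output` and `mime_types` are hypothetical, and `mime_types` requires the hachoir-metadata program to be available on the PATH.

```python
import subprocess


def parse_mime_output(output):
    """Map each file name to the MIME type printed on its line."""
    result = {}
    for line in output.splitlines():
        # Split on the last ": " so a file name containing ": " still parses.
        name, separator, mime = line.rpartition(": ")
        if separator:
            result[name] = mime.strip()
    return result


def mime_types(filenames):
    """Run hachoir-metadata --mime on the given files and parse its output."""
    process = subprocess.run(
        ["hachoir-metadata", "--mime", *filenames],
        capture_output=True, text=True, check=True,
    )
    return parse_mime_output(process.stdout)


sample = (
    "logo-Kubuntu.png: image/png\n"
    "sheep_on_drugs.mp3: audio/mpeg\n"
    "wormux_32x32_16c.ico: image/x-ico\n"
)
print(parse_mime_output(sample))
```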
Option --type displays a short description of the file type:
    $ hachoir-metadata --type logo-Kubuntu.png sheep_on_drugs.mp3 wormux_32x32_16c.ico
    logo-Kubuntu.png: PNG picture: 331x90x8 (alpha layer)
    sheep_on_drugs.mp3: MPEG v1 layer III, 128.0 Kbit/sec, 44.1 KHz, Joint stereo
    wormux_32x32_16c.ico: Microsoft Windows icon: 16x16x32
(it works like the UNIX "file" command)
Filter metadata with --level
hachoir-metadata is quite verbose by default:
    $ hachoir-metadata logo-Kubuntu.png
    Image:
    - Image width: 331
    - Image height: 90
    - Bits/pixel: 8
    - Image format: Color index
    - Creation date: 2006-05-26 09:41:46
    - Compression: deflate
    - MIME type: image/png
    - Endian: Big endian
You can skip less important information (here, keeping only values up to level 7):
    $ hachoir-metadata --level=7 logo-Kubuntu.png
    Image:
    - Image width: 331
    - Image height: 90
    - Bits/pixel: 8
    - Image format: Color index
    - Creation date: 2006-05-26 09:41:46
    - Compression: deflate
Example to get only the most important information:
    $ hachoir-metadata --level=3 logo-Kubuntu.png
    Image:
    - Image width: 331
    - Image height: 90
    - Bits/pixel: 8
    - Image format: Color index
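The idea behind --level can be sketched in a few lines of Python. This is only an illustration of priority-based filtering, not the implementation in hachoir-metadata: the function `filter_by_level` is hypothetical, and the level numbers attached to each value are made up so that the result matches the three listings above.

```python
# Hypothetical (label, value, level) entries. The levels are invented for
# this sketch; they are not the priorities used internally by hachoir.
ITEMS = [
    ("Image width", "331", 1),
    ("Image height", "90", 1),
    ("Bits/pixel", "8", 2),
    ("Image format", "Color index", 3),
    ("Creation date", "2006-05-26 09:41:46", 5),
    ("Compression", "deflate", 7),
    ("MIME type", "image/png", 8),
    ("Endian", "Big endian", 9),
]


def filter_by_level(items, level):
    """Keep entries whose level is at most `level` (lower = more important)."""
    return [(label, value)
            for label, value, item_level in items
            if item_level <= level]


for label, value in filter_by_level(ITEMS, 3):
    print("- %s: %s" % (label, value))
```

With these made-up levels, a limit of 9 keeps everything, 7 drops the MIME type and endianness, and 3 keeps only the four basic image properties.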
Getting help: --help
Use the --help option to get the full list of options.
Integration of hachoir-metadata
- amplee is a Python implementation of the Atom Publishing Protocol (APP)
- Plone? (project in snow sprint 2007)
- See also plone4artists-sprint
TODO
- #83: TIFF metadata are poor
- #88: Extract file type: MIME type, ontology or something else?
- #125: Use Dublin Core
- #172: Hachoir metadata returns wrong MIME type
See also
See also: file format resources.
Information
- (fr) DCMI Metadata Terms: Classification of metadata defined by the Dublin Core
- (fr) Dublin Core article on Openweb website
- (fr) avi_ogminfo: Information about AVI and OGM files
- (en) Xesam (was Wasabi): common interface between programs extracting metadata
Libraries
- (fr|en) MediaInfo (GPL v2, C++)
- (en) Mutagen: audio metadata tag reader and writer (Python)
- (en) getid3: Library written in PHP to extract metadata from several multimedia file formats (and not only MP3)
- (fr|en) libextractor: Library dedicated to meta-data extraction. See also: (en) Bader's Python binding
- (en) Kaa (part of Freevo): replaces mmpython (Media Metadata for Python) (dead project)
- (en) ExifTool: Perl library to read and write metadata
Programs
- jpeginfo
- ogginfo
- mkvinfo
- mp3info
Programs using metadata
Attachments
- hachoir-metadata-gtk.png (29.3 kB): Screenshot of hachoir-metadata-gtk version 1.1 on MP3 music with ID3 tags, added by haypo on 04/01/08 19:58:45.
