Size
In Hachoir, the size of fields and streams isn't a trivial subject, for two reasons:
- Sometimes, when a parser is about to create a field, there are several ways to determine the size and there is a preferred one.
- The size of the stream may be unknown.
A few rules to follow when you write a parser
Setting the size of the field that is to be created
The core handles any invalid size, so don't worry about corrupted files: instead of computing a size from the sizes of other fields, rely, whenever possible, on the size information stored in the file.
More precisely:
- The idea is to validate as much of the data in the file as possible.
- Don't use the size of the root field if possible, since it may be unknown.
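As a rough illustration of this rule, here is a standalone sketch in plain Python (it does not use Hachoir). It assumes a made-up format in which each chunk starts with a one-byte payload length; `read_chunks` and the format are hypothetical. The size of each chunk comes from the file itself, the total size of the stream is never consulted, and the short read on a corrupted length merely stands in for whatever the core does with an invalid size.

```python
import io


def read_chunks(stream):
    """Read (declared_length, payload) pairs from a binary stream.

    Hypothetical format: a 1-byte payload length followed by the payload.
    """
    chunks = []
    while True:
        header = stream.read(1)
        if not header:
            # End of input reached; the total size was never needed
            break
        declared = header[0]             # size stored in the file
        payload = stream.read(declared)  # shorter than declared if the file is truncated
        chunks.append((declared, payload))
    return chunks


# Second chunk declares 9 bytes, but only 2 are present (corrupted file)
sample = io.BytesIO(bytes([3]) + b"abc" + bytes([9]) + b"xy")
for declared, payload in read_chunks(sample):
    status = "ok" if len(payload) == declared else "truncated"
    print(declared, payload, status)
```

Because the loop only ever asks the stream for the next bytes, the same code works on a pipe whose total size is unknown.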
Number of fields to add
For the same reasons as for the size, read the number of fields to add from the file, so that you avoid comparing self.current_size and self.size.
If that's not possible, and if you want to fill your field set, there is another method: GenericFieldSet provides an eof member that tests whether the end of the field has been reached. Using
>>> not self.eof
instead of
>>> self.current_size < self.size
is mandatory for the root field. For other fields, unless eof is faster, write what you want.
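To see why the eof test matters when the size is unknown, consider the following sketch. `FakeFieldSet` is a hypothetical stand-in written for this example, not Hachoir's GenericFieldSet: it only borrows the attribute names used above (size, current_size, eof, createFields) and implements eof by peeking one byte ahead, which is an assumption of this sketch.

```python
import io


class FakeFieldSet:
    """Stand-in for a field set; NOT Hachoir's GenericFieldSet."""

    def __init__(self, stream, size=None):
        self._stream = stream
        self.size = size        # in bits; None when the stream size is unknown
        self.current_size = 0   # bits consumed so far
        self._peeked = b""      # one-byte lookahead buffer

    @property
    def eof(self):
        # Hypothetical end test: peek one byte instead of using self.size
        if not self._peeked:
            self._peeked = self._stream.read(1)
        return not self._peeked

    def createFields(self):
        while not self.eof:
            byte, self._peeked = self._peeked[0], b""
            self.current_size += 8
            yield byte


root = FakeFieldSet(io.BytesIO(b"\x01\x02\x03"))  # size left unknown
print(list(root.createFields()))

# With an unknown size, the comparison cannot work: Python 3 raises TypeError
try:
    root.current_size < root.size
except TypeError as err:
    print("comparison failed:", err)
```

The loop terminates correctly even though size stays None for the whole run.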
The root field
Padding data at the end
That's a special case where you are allowed to use self.size. Use something like:
>>> if self.current_size < self.size:
...     yield self.seekBit(self.size, "end")
For streams with an unknown size, the field won't be created but who cares? It's better than reading the whole stream to set its size.
If you still need the size
Really? That's problematic. It means that Hachoir won't support piped inputs for your format.
OK, do what you can and abort if self.size is None. If you need the size at the beginning of createFields, simply test self.size in validate.
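A minimal sketch of such a test follows. `FakeParser` is a hypothetical stand-in, not Hachoir's parser class, and the convention that validate returns True on success and an error message otherwise is an assumption of this example.

```python
class FakeParser:
    """Stand-in for a parser; NOT Hachoir's parser class."""

    def __init__(self, size=None):
        self.size = size  # in bits; None when the stream size is unknown

    def validate(self):
        # Assumed convention: True if the input is acceptable,
        # otherwise a string describing the problem
        if self.size is None:
            return "The size of the stream is required for this format"
        return True


print(FakeParser(size=1024).validate())
print(FakeParser().validate())
```

Rejecting the input this early means createFields can use self.size without checking it again.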
Searching for data until the end of the file
Avoid scanning the file in validate. That would break the support of piped inputs.
To know if you are allowed to do that, test if self.stream.checked is true.