Structures

Structures are ordered collections of fields, organizing data into a consistent format. They’re defined as Python classes, with fields as their attributes, in a pattern that’s now fairly familiar from other Python frameworks.

Discussing structures requires showing some fields. For more information on how fields work, and on the specific field types available, see Fields.

Structure Definition

A structure is defined by subclassing steel.Structure and declaring field attributes:

import steel

class NetworkPacket(steel.Structure):
    header = steel.Integer(size=2)
    message = steel.NullTerminatedString(encoding="ascii")
    checksum = steel.Integer(size=2)

Each field in the structure corresponds to a piece of data that can be read or written in a predictable format. The order of fields must match the order in which the data appears in the original format.

Working with Structure Instances

Creating Instances

You can create structure instances in several ways:

# Direct instantiation with keyword arguments
packet = NetworkPacket(header=1234, message="Hello", checksum=0xABCD)

# Empty instantiation and attribute assignment
packet = NetworkPacket()
packet.header = 1234
packet.message = "Hello"
packet.checksum = 0xABCD

Reading from Buffers

Use the load() class method to parse binary data:

import steel
from io import BytesIO

class NetworkPacket(steel.Structure):
    header = steel.Integer(size=2)
    message = steel.NullTerminatedString(encoding="ascii")
    checksum = steel.Integer(size=2)

binary_data = b"\xd2\x04Hello\x00\xcd\xab"
buffer = BytesIO(binary_data)
packet = NetworkPacket.load(buffer)

print(packet.header)    # 1234
print(packet.message)   # "Hello"
print(packet.checksum)  # 43981 (0xABCD)

If you don’t have (or want) a file-like object, you can also use loads() to read from a sequence of bytes instead.

packet = NetworkPacket.loads(binary_data)

print(packet.header)    # 1234
print(packet.message)   # "Hello"
print(packet.checksum)  # 43981 (0xABCD)
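To make the parsing steps concrete, here is a rough standard-library sketch of reading the same bytes by hand. It is not how Steel is implemented. It assumes unsigned, little-endian 2-byte integers, which is what the sample bytes above suggest, and read_packet_by_hand is a hypothetical helper name.

```python
import struct
from io import BytesIO


def read_packet_by_hand(buffer):
    """Read header, null-terminated message, and checksum, in that order."""
    # Assumption: 2-byte unsigned little-endian integer.
    (header,) = struct.unpack("<H", buffer.read(2))

    # Collect bytes up to (and consuming) the null terminator.
    raw = bytearray()
    while (byte := buffer.read(1)) not in (b"", b"\x00"):
        raw += byte
    message = raw.decode("ascii")

    (checksum,) = struct.unpack("<H", buffer.read(2))
    return header, message, checksum


fields = read_packet_by_hand(BytesIO(b"\xd2\x04Hello\x00\xcd\xab"))
print(fields)  # (1234, 'Hello', 43981)
```

Each read consumes exactly the bytes belonging to one field, which is why the structure only needs the field declarations to walk through the buffer.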

Writing to Buffers

Use the dump() method to serialize data back to binary format:

packet = NetworkPacket(header=1234, message="Hello", checksum=0xABCD)

output = BytesIO()
bytes_written = packet.dump(output)

binary_data = output.getvalue()
print(f"Wrote {bytes_written} bytes")

If you don’t have (or want) a file-like object, you can also use dumps() to return a sequence of bytes instead.

binary_data = packet.dumps()
print(f"Wrote {len(binary_data)} bytes")

Field Order and Layout

Fields are processed in the order they’re declared in the class definition. This determines both the order of reading from buffers and writing to buffers:

import steel

class NetworkPacket(steel.Structure):
    header = steel.Integer(size=2)        # Read/written first
    message = steel.NullTerminatedString(encoding="ascii")  # Read/written second
    checksum = steel.Integer(size=2)      # Read/written third
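The effect of declaration order on the byte layout can be sketched with the standard library alone. This is an illustration rather than Steel's implementation: the LAYOUT list and the encode_* helpers are hypothetical, and the integers are assumed to be unsigned and little-endian.

```python
import struct


def encode_u16(value):
    # Assumption: 2-byte unsigned little-endian integer.
    return struct.pack("<H", value)


def encode_cstring(text):
    # ASCII text followed by a single null terminator.
    return text.encode("ascii") + b"\x00"


# The list order plays the role of the declaration order in the class body.
LAYOUT = [
    ("header", encode_u16),
    ("message", encode_cstring),
    ("checksum", encode_u16),
]


def write_by_hand(values, layout=LAYOUT):
    """Concatenate each encoded value in layout order."""
    return b"".join(encode(values[name]) for name, encode in layout)


values = {"header": 1234, "message": "Hello", "checksum": 0xABCD}
data = write_by_hand(values)
print(data)       # b'\xd2\x04Hello\x00\xcd\xab'
print(len(data))  # 10

# Reversing the layout produces different bytes from the same values.
swapped = write_by_hand(values, list(reversed(LAYOUT)))
print(swapped)    # b'\xcd\xabHello\x00\xd2\x04'
```

The same values yield two incompatible byte sequences, so a reader and a writer must agree on the field order.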

Error Handling

If you try to access an attribute that wasn’t set during instantiation, you’ll get an AttributeError:

import steel

class NetworkPacket(steel.Structure):
    header = steel.Integer(size=2)
    message = steel.NullTerminatedString(encoding="ascii")
    checksum = steel.Integer(size=2)

packet = NetworkPacket(header=1234)  # Only header set
print(packet.header)    # Works: 1234
print(packet.message)   # Raises AttributeError
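Because unset fields raise AttributeError, Python's built-in hasattr() can be used to find out which fields still need values. The sketch below uses a plain stand-in class so that it runs without Steel. The missing_fields helper and PartialPacket class are hypothetical, and fields with default values (where supported) would presumably not be reported as missing.

```python
def missing_fields(instance, field_names):
    """Return the names whose attribute access raises AttributeError."""
    return [name for name in field_names if not hasattr(instance, name)]


class PartialPacket:
    """Stand-in for a structure instance; not part of Steel."""

    def __init__(self, **values):
        for name, value in values.items():
            setattr(self, name, value)


packet = PartialPacket(header=1234)  # Only header set
print(missing_fields(packet, ["header", "message", "checksum"]))
# ['message', 'checksum']
```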

Validation

Structures support basic validation to ensure all field values conform to their expected formats and constraints. This helps catch data integrity issues before writing to buffers or after reading from potentially corrupted data.

Important

Validation is _not_ performed automatically. Many projects don’t need it, and many more don’t need it to happen every time a structure is written out, so it’s a separate step. For cases that do need it, validation is simple to perform, so this shouldn’t be too onerous a requirement.

Basic Validation

Use the validate() method to check that all fields in a structure contain valid values:

import steel

class NetworkPacket(steel.Structure):
    header = steel.Integer(size=2)
    message = steel.NullTerminatedString(encoding="ascii")
    checksum = steel.Integer(size=2)

packet = NetworkPacket(header=1234, message="Hello", checksum=0xABCD)
packet.validate()  # Raises ValidationError if any field is invalid

The validation process checks that each field has a value and that the value is valid according to the field’s specific constraints. See the documentation for each field for details on its validation behavior.

Handling Validation Errors

When validation fails, a ValidationError is raised with details about the problem:

import steel

class NetworkPacket(steel.Structure):
    header = steel.Integer(size=2)
    message = steel.NullTerminatedString(encoding="ascii")
    checksum = steel.Integer(size=2)

packet = NetworkPacket(header=70000, message="Hello", checksum=0xABCD)  # Header too big

try:
    packet.validate()
except steel.ValidationError as e:
    print(f"Validation failed: {e}")

Common validation scenarios that raise errors:

import steel

class NetworkPacket(steel.Structure):
    header = steel.Integer(size=2)
    message = steel.NullTerminatedString(encoding="ascii")
    checksum = steel.Integer(size=2)

packet = NetworkPacket(header=70000, message="Hello", checksum=0xABCD)
packet.validate()  # ValidationError: value exceeds maximum

packet = NetworkPacket(header=1234, message="héllo", checksum=0xABCD)
packet.validate()  # ValidationError: invalid encoding

Note

If multiple fields are invalid, only _one_ ValidationError will be raised, for the first field that failed to validate. A future update may include an API to retrieve multiple validation errors in one pass.
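To see why the two values above are rejected, the underlying constraints can be checked with plain Python. This sketch assumes that a 2-byte Integer is unsigned, which the 0xABCD checksum in earlier examples implies. The helper names are hypothetical and this is not Steel's validation code.

```python
def fits_unsigned(value, size):
    """True if value fits in an unsigned integer of `size` bytes."""
    return 0 <= value < 2 ** (8 * size)


def is_encodable(text, encoding):
    """True if text can be represented in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


print(fits_unsigned(70000, 2))         # False: the maximum is 65535
print(fits_unsigned(0xABCD, 2))        # True
print(is_encodable("héllo", "ascii"))  # False: "é" is not ASCII
print(is_encodable("Hello", "ascii"))  # True
```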

Validation with Missing Fields

If a field hasn’t been assigned a value, validation will also raise a ValidationError:

import steel

class NetworkPacket(steel.Structure):
    header = steel.Integer(size=2)
    message = steel.NullTerminatedString(encoding="ascii")
    checksum = steel.Integer(size=2)

packet = NetworkPacket(header=1234)  # Missing message and checksum
packet.validate()  # ValidationError

This ensures that all required fields are present before attempting to write the structure to a buffer.

Validating After Reading

Validation is also useful after reading binary data, to verify its integrity:

import steel
from io import BytesIO

class NetworkPacket(steel.Structure):
    header = steel.Integer(size=2)
    message = steel.NullTerminatedString(encoding="ascii")
    checksum = steel.Integer(size=2)

# Read potentially corrupted data (some_binary_source() is a placeholder
# for wherever your bytes come from)
binary_data = some_binary_source()
buffer = BytesIO(binary_data)

try:
    packet = NetworkPacket.load(buffer)
    packet.validate()  # Verify the parsed data is valid
    print("Data successfully validated")
except steel.ValidationError as e:
    print(f"Corrupted data detected: {e}")

Warning

This approach only works if all the fields can at least read the data into the structure. If any field fails to even get that far (such as invalid text for a specified encoding), field-specific exceptions can be raised during load(), so you should prepare for that as well.
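One way to prepare for both kinds of failure is to separate the read step from the validation step. The sketch below is a hypothetical helper, not part of Steel. It takes the structure class and the validation exception as parameters, and uses stand-in classes so that it runs on its own. With Steel, the call would presumably look like load_checked(NetworkPacket, data, steel.ValidationError). The broad except clause is deliberate here, because the exact exceptions raised during reading depend on the fields involved.

```python
def load_checked(structure_cls, data, validation_error):
    """Read, then validate, reporting which stage failed."""
    try:
        instance = structure_cls.loads(data)
    except Exception as exc:  # Broad on purpose; narrow it for your fields
        raise ValueError(f"unreadable data: {exc}") from exc

    try:
        instance.validate()
    except validation_error as exc:
        raise ValueError(f"invalid data: {exc}") from exc
    return instance


# Stand-ins so the sketch runs without Steel; not part of the library.
class FakeValidationError(Exception):
    pass


class FakeMessage:
    def __init__(self, text):
        self.text = text

    @classmethod
    def loads(cls, data):
        # Raises UnicodeDecodeError for non-ASCII bytes.
        return cls(data.decode("ascii"))

    def validate(self):
        if not self.text:
            raise FakeValidationError("text is empty")


print(load_checked(FakeMessage, b"Hello", FakeValidationError).text)  # Hello

for bad in (b"\xff", b""):
    try:
        load_checked(FakeMessage, bad, FakeValidationError)
    except ValueError as exc:
        print(exc)
```

The first bad input fails while reading and the second fails during validation, so callers can tell the two cases apart from the message or from the chained exception.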

Best Practices

  1. Validate before writing: Call validate() before writing to ensure complete, valid data.

  2. Handle missing fields: Use try/except blocks to gracefully handle incomplete structures.

  3. Validate incrementally: For complex structures, consider validating fields as you set them rather than waiting until the end.

  4. Validate after reading: Always validate structures after reading from external sources to catch data corruption early.

  5. Prepare for exceptions during reading: Don’t assume that every file can be read well enough to be able to call validate() on the result.

import steel

class NetworkPacket(steel.Structure):
    header = steel.Integer(size=2)
    message = steel.NullTerminatedString(encoding="ascii")
    checksum = steel.Integer(size=2)

# Good practice: validate after reading unknown data
def parse_file(filepath):
    with open(filepath, "rb") as f:
        try:
            packet = NetworkPacket.load(f)
            packet.validate()
            return packet
        except steel.ValidationError as e:
            # Chain the original error so its details aren't lost
            raise ValueError(f"Invalid file format: {filepath}") from e

# Good practice: ensure completeness before writing
def write_packet(packet, output):
    packet.validate()  # Ensures all fields are present and valid
    return packet.dump(output)

Advanced Usage

Configuring fields at the structure level

Structures often contain many fields that share configuration options, such as byte ordering or text encoding. You can configure each of these fields individually, but to simplify the structure definition, you can instead set these options once at the structure level, where they apply to every field on that structure. To do so, pass the options as keyword arguments alongside the steel.Structure base class when defining the class.

import steel

class NetworkPacket(steel.Structure, endianness=">", encoding="ascii"):
    header = steel.Integer(size=2)  # Will encode big-endian values
    message = steel.NullTerminatedString()  # Will use ASCII encoding
    checksum = steel.Integer(size=4, endianness="<")  # Overrides to little-endian

Note

Options specified on the structure will override any defaults defined in the fields, but configuring individual fields will take priority over anything specified on the structure.

This is especially helpful for large structures that repeat a lot of the same kind of field, because a format is typically consistent about how its data is represented. Configuring these options on the structure itself can save a lot of duplication throughout the fields themselves.
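The precedence described above (field option, then structure option, then field default) can be modeled with collections.ChainMap. This is only an illustration of the lookup order, not Steel's internals. The FIELD_DEFAULTS values, including the little-endian default, are assumptions, and resolve_options and encode_integer are hypothetical helpers.

```python
from collections import ChainMap

# Assumed defaults, for illustration only.
FIELD_DEFAULTS = {"endianness": "<", "encoding": "utf-8"}

# Map the "<" and ">" markers to the names int.to_bytes() expects.
BYTE_ORDER = {"<": "little", ">": "big"}


def resolve_options(field_options, structure_options, defaults=FIELD_DEFAULTS):
    """Field options win over structure options, which win over defaults."""
    return dict(ChainMap(field_options, structure_options, defaults))


def encode_integer(value, size, endianness):
    return value.to_bytes(size, BYTE_ORDER[endianness])


structure_options = {"endianness": ">", "encoding": "ascii"}

header = resolve_options({}, structure_options)
checksum = resolve_options({"endianness": "<"}, structure_options)

print(header["endianness"], checksum["endianness"])    # > <
print(encode_integer(1234, 2, header["endianness"]))   # b'\x04\xd2'
print(encode_integer(1234, 4, checksum["endianness"])) # b'\xd2\x04\x00\x00'
```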

Warning

Not every field option can be specified on the structure. Consult the Fields documentation for details about each field’s behavior.

How missing values are handled

Because binary data doesn’t typically have headings for each value, as JSON or YAML do, there’s often no easy way to write the data out when values are missing. Therefore, the default behavior is to raise an AttributeError when accessing any field that doesn’t yet have a value, including when writing to a data buffer.

Some fields can also have default values, which will allow you to write data even if you haven’t supplied a value for a given field. Check each field’s documentation for details.

Configuration Access

Danger

While this may be useful for certain applications, _config is not yet a stable API. It’s meant for internal use and shouldn’t be necessary for the vast majority of Steel usage. It’s included here for use cases that can’t be handled any other way, for users who understand the risks and are willing to accept breakage in future releases.

Each structure class has a _config attribute that provides access to the field configuration, which can be useful for introspection and dynamic field processing.

import steel

class NetworkPacket(steel.Structure):
    header = steel.Integer(size=2)
    message = steel.NullTerminatedString(encoding="ascii")
    checksum = steel.Integer(size=2)

# Access all fields
for name, field in NetworkPacket._config.fields.items():
    print(f"Field {name}: {field.__class__.__name__}")

# Access specific field
header_field = NetworkPacket._config.fields["header"]
print(f"Header field size: {header_field.size}")

This configuration object has the following attributes:

  • fields is a dictionary of the fields that are specified on the structure. Because Python dictionaries are ordered by default, iterating over this dictionary – or its keys or values individually – will yield fields in the correct order.

  • options is a dictionary of field options that were supplied at the structure level. This will contain everything that was supplied in the class definition, regardless of whether it actually overrode any particular field’s configuration.