The APX file format
An APX file varies in size because it stores many different kinds of information.
In general terms, the APX file is composed of a first section with general information about the strategy (timestamps, versions, owners, CRCs, the internal strategy password). This section is about 2000 bytes long.
The next section determines the number of holding registers (MW%), register constants (KW%) and coils (M%), and stores all the constant values. Its size depends on the number of constants specified.
The next section stores the error messages that the PLC could send to Unity Pro.
The next section stores network information (IPs, network masks, SNMP configuration, NTP configuration, SMTP configuration, etc.) of all the networking modules associated with the PLC.
The next section stores the strategy in binary format. It also stores chunks of ZLIB-compressed bytes which, when uncompressed, turn out to be XML files with configuration information.
So the main structure is:
- General information section
- Registers and coils
- Error messages
- Network information
- Strategy in binary format
General information section
This section consists of a number of subsections, separated by an “E0 <16-bit CRC> 00 00” separator.
The first subsections are the following:
As we can see, these first blocks start with 2 or 3 ASCII characters (APX subsection, SD subsection, RT subsection…).
At the end of each of these blocks, we can see the signature previously mentioned (“E0 16-bit CRC 00 00”).
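The separator described above can be located with a simple byte-pattern scan. The following is a minimal sketch, assuming the layout is exactly one E0 byte, two CRC bytes and two 00 bytes; the function name and the sample buffer are hypothetical, and on a real file the pattern may also match unrelated data.

```python
import re

# Assumed separator layout, taken from the text: E0, two CRC bytes, 00 00.
SEPARATOR = re.compile(rb"\xE0(.{2})\x00\x00", re.DOTALL)

def find_separators(data: bytes):
    """Return (offset, crc_bytes) for every candidate separator in data."""
    return [(m.start(), m.group(1)) for m in SEPARATOR.finditer(data)]

# Synthetic example, not real APX content.
sample = b"APX" + b"\x11" * 5 + b"\xE0\x12\x34\x00\x00" + b"SD" + b"\x22" * 4
for offset, crc in find_separators(sample):
    print(f"{offset:06X}h  crc bytes = {crc.hex(' ')}")
```

The CRC bytes are returned raw because the byte order of the stored 16-bit value is not known.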
In the first block, the four bytes starting at byte 11 (offset 0Bh) are a 4-byte CRC, in this case “00 C8 2D 0D”, which is repeated in 4 or 5 places in the first section.
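To see where this 4-byte value is repeated, one can list every offset at which it occurs. This is a small sketch; the helper name and the sample buffer are illustrative only, and a real APX file would be read from disk instead.

```python
def find_all(data: bytes, needle: bytes):
    """Return every offset at which needle occurs in data (overlaps allowed)."""
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets

CRC_BYTES = bytes.fromhex("00 C8 2D 0D")  # value quoted in the text

# Synthetic buffer standing in for an APX file.
sample = bytes(11) + CRC_BYTES + bytes(20) + CRC_BYTES + bytes(5)
print([f"{o:06X}h" for o in find_all(sample, CRC_BYTES)])
```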
We think that the 16-bit CRC values found throughout the APX file validate the subsection that follows them: they always appear at the beginning of a subsection (or at the end of the previous one, which amounts to the same thing).
Subsections
The first subsection is two bytes smaller when it belongs to the TSX Premium PLC. The two missing bytes seem to be two “00” bytes before the end of the subsection (before “93E0”).
The second subsection is also smaller, by four bytes, when it belongs to the TSX Premium PLC.
Strangely enough, the third subsection (the one starting with the ASCII characters “RT”) is much longer in the TSX PLC (110 vs 64 bytes).
This is because in a TSX PLC APX file, subsections 4 and 5 do not exist and their information is stored, in abbreviated form, in those extra 56 bytes (project date and timestamp, versions, etc.).
This would be the fourth subsection:
This is the fifth subsection:
This subsection stores project information like timestamps, versions, etc, and it's accessed when a message “00 03 00” is sent to the PLC.
And this is the sixth subsection, which also stores some information about the project.
After the project version information come 1024 (0400h) blank bytes intended to store the project description.
These are the subsequent subsections. Note that subsection 7 is sent when a “00 03 04” request is made.
However, this is not the structure of this section if we take into account how the file is read or written during the upload/download process:
The thin blue lines indicate how the bytes are read or written during the strategy upload/download process. They have nothing to do with the separators previously mentioned.
In the red boxes we can see the four-byte CRC value, which is repeated at least 4 times throughout the APX file. The small red boxes are the 16-bit CRC values. These 16-bit CRCs change on every regeneration of the project, even when the project has NOT changed at all.
Section 2. Registers and coils
Section 2 has no subsections.
Starting at offset AA0h comes a group of bytes that define the number of bytes stored for holding registers (MW%), constants (KW%, at offset 0B02h: “FF 01”, 1023 bytes, 512 register constants) and coils (M%).
After the B00 line come as many bytes as indicated by the KW% value at 0B02h. In this case the stored bytes are “FF 01”, i.e. 01FFh in little endian. That means the register constants are stored in the APX file from offset 0B0Ch to offset 0D0Bh.
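The little-endian decoding and the offset range quoted above can be checked with a few lines. This is only a sketch on a synthetic buffer: the constant names are hypothetical, and the offsets are the ones given in the text for our file. Note that the range 0B0Ch to 0D0Bh spans 512 bytes, which is the decoded field value plus one; whether that relation holds in general is an assumption we have not verified.

```python
import struct

KW_COUNT_OFFSET = 0x0B02   # offset of the KW% size field, per the text
KW_DATA_START = 0x0B0C     # first byte of the stored register constants
KW_DATA_END = 0x0D0B       # last byte of the stored register constants

def read_u16_le(data: bytes, offset: int) -> int:
    """Decode an unsigned 16-bit little-endian value at offset."""
    return struct.unpack_from("<H", data, offset)[0]

# Synthetic buffer holding only the two bytes quoted in the text.
sample = bytearray(0x0D0C)
sample[KW_COUNT_OFFSET:KW_COUNT_OFFSET + 2] = bytes.fromhex("FF 01")

value = read_u16_le(sample, KW_COUNT_OFFSET)
span = KW_DATA_END - KW_DATA_START + 1
print(f"field = {value:04X}h ({value}), data span = {span} bytes")
```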
Section 3. Error messages
A block containing all the error messages starts at offset 0D0Ch:
This block ends with four bytes set to FF (first black box):
It's important to note that both this section and the next start with the four bytes “01 00 00 00”, and that subsections seem to start with “01 00 A0 00”.
Section 4. Network information
Following the end of the error message list, we can see two subsection separators, and in the third subsection we can see, in the three coloured boxes, the internal IPs of our PLC (in little-endian format). In this case “CD 0D 1E 0A” stands for 10.30.13.205, “01 0D 1E 0A” stands for 10.30.13.1 (the gateway IP) and “00 FF FF FF” is the netmask, 255.255.255.0.
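Decoding these addresses only requires reversing the four stored bytes. A minimal sketch, using the PLC address and netmask bytes quoted above (the function name is hypothetical):

```python
import ipaddress

def decode_ip_le(raw: bytes) -> str:
    """Decode 4 bytes stored least-significant octet first into dotted form."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return str(ipaddress.IPv4Address(raw[::-1]))

print(decode_ip_le(bytes.fromhex("CD 0D 1E 0A")))  # PLC address bytes from the text
print(decode_ip_le(bytes.fromhex("00 FF FF FF")))  # netmask bytes from the text
```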
After more than 3000 “00” bytes comes the next block of network information.
At offset 2350h (in our case) appears the SNMP configuration, with SNMP keys (“public1, public2, public3”), SysLocation (“KKKKKKK...”) and SysContact data (“123456789...”), administration IP and other SNMP information.
Next comes the SMTP information, with the SMTP server IP address, port, username (“root”) and password (“123456789012”). It also stores the SMTP From, To and subject information.
This information could allow an attacker to send emails on behalf of the PLC.
Then comes a section of repetitive characters:
In blue we can see the APX file of the TSX PLC and in white that of the Modicon M340. There are some differences, but both are essentially a list of “DC053C” values.
In my view, this serves as a kind of synchronization pattern.
If the strategy has several cards configured, an “FF FF FF FF” separator follows and the process starts again.
Section 5. Binary Strategy code
After the network information comes what we think is the strategy binary code and some blobs of ZLIB bytes.
This section is divided into “PK” subsections:
As we can see, after the corresponding “?? E0 CRC16 00 00” signature comes a “PK” signature, and then another PK compressed subsection follows.
This occurs up to 9 times before the end of OUR strategy. Some of these PK subsections seem to hold compressed data; others, like the one shown here, look like binary data:
The PK signatures appear in our strategy in offsets:
- 000050F0h
- 000052B0h
- 00005C10h
- 00005CF6h
- 000064C4h
- 000066A5h
- 000066FAh
- 000069C4h
- 00006AEDh
- 00010065h
- 00010207h
As we can see, their size is small except for one, which is the only one with binary aspect.
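The size difference can be made explicit by computing the distance from each PK signature to the next one. This sketch uses the offsets listed above; the distance is only an upper bound on each subsection's size, since it includes the separator that precedes the next signature, and the last subsection has no following offset to measure against.

```python
PK_OFFSETS = [
    0x50F0, 0x52B0, 0x5C10, 0x5CF6, 0x64C4, 0x66A5,
    0x66FA, 0x69C4, 0x6AED, 0x10065, 0x10207,
]

def gaps(offsets):
    """Distance from each offset to the next one, as (start, distance) pairs."""
    return [(a, b - a) for a, b in zip(offsets, offsets[1:])]

for start, size in gaps(PK_OFFSETS):
    print(f"{start:08X}h  {size:6d} bytes to next PK")

# The subsection followed by the largest gap.
largest = max(gaps(PK_OFFSETS), key=lambda pair: pair[1])
print(f"largest: {largest[0]:08X}h ({largest[1]} bytes)")
```

With these offsets, the subsection starting at 00006AEDh is followed by a gap of 38264 bytes, while every other gap is 2400 bytes or less.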
Using the binwalk tool, we found at least 5 ZLIB blobs, stored in two different PK subsections.
When we decompressed these ZLIB blobs, we found that some of them were compressed XML files, while others stored ST functions:
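A similar scan can be reproduced without binwalk using Python's zlib module. The sketch below is not what binwalk does internally; it simply tries to decompress at every position that looks like a zlib header. It assumes the blobs use the common 78h first byte (deflate with a 32K window), and the function name, the size threshold and the sample data are all hypothetical.

```python
import zlib

def find_zlib_blobs(data: bytes, min_size: int = 16):
    """Yield (offset, decompressed_bytes) for each zlib stream found in data."""
    pos = 0
    while True:
        pos = data.find(b"\x78", pos)
        if pos == -1 or pos + 2 > len(data):
            return
        # A valid zlib header is a 16-bit big-endian value divisible by 31.
        if int.from_bytes(data[pos:pos + 2], "big") % 31 == 0:
            decomp = zlib.decompressobj()
            try:
                out = decomp.decompress(data[pos:])
            except zlib.error:
                out = b""
            if decomp.eof and len(out) >= min_size:
                yield pos, out
                # Continue right after the bytes consumed by this stream.
                pos = len(data) - len(decomp.unused_data)
                continue
        pos += 1

# Synthetic example: one compressed XML snippet surrounded by other bytes.
xml = b"<config><item>example</item></config>"
sample = b"PK" + bytes(7) + zlib.compress(xml) + b"\xE0\x12\x34\x00\x00"
blobs = list(find_zlib_blobs(sample))
print(blobs)
```

The minimum-size filter is there to discard very short accidental matches; it can be lowered if small blobs are expected.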
In the large PK subsection we can even understand the binary code. We wrote a simple program and looked for it in the binary code, and we found it:
By writing it out in our own assembler-like notation, we could understand where it was stored and how it was run.
Changes in the APX after a regeneration with no changes at all
We have found that when a project is regenerated with no changes, the resulting APX file differs from the previous one in 36 bytes.
The different bytes are the following:
APX subsection
- 00000Bh (4 bytes) → CRC32 value
SD subsection
- 00003Ch (2 bytes) → SD subsection CRC16 value
- 000067h (4 bytes) → Copy of the CRC32 value
- 00006Eh (1 byte) → Regeneration version number
RT subsection
- 000164h (2 bytes) → RT subsection CRC16 value
- 000284h (2 bytes) → 4th subsection CRC16 value
- 0002AFh (3 bytes) → Timestamp
- 0002B7h (3 bytes) → Same timestamp
- 0002E5h (4 bytes) → Copy of the CRC32 value
- 0002EDh (4 bytes) → Copy of the CRC32 value
- 0002F1h (4 bytes) → Copy of the CRC32 value
- 000301h (1 byte) → Unknown
- 000355h (1 byte) → Version number
- 000359h (1 byte) → Unknown
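A comparison of this kind can be reproduced with a short byte-level diff that groups differing bytes into runs. This is a sketch of the general approach, not the exact tool we used; it assumes both files have the same length, and the two sample buffers are synthetic.

```python
def diff_runs(a: bytes, b: bytes):
    """Return (offset, length) for each run of consecutive differing bytes."""
    runs = []
    start = None
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y and start is None:
            start = i                      # a new run of differences begins
        elif x == y and start is not None:
            runs.append((start, i - start))  # the current run ends here
            start = None
    if start is not None:
        runs.append((start, min(len(a), len(b)) - start))
    return runs

# Synthetic buffers standing in for two regenerations of the same project.
old = bytes.fromhex("00 11 22 33 44 55 66 77")
new = bytes.fromhex("00 AA BB 33 44 55 CC 77")
for offset, length in diff_runs(old, new):
    print(f"{offset:06X}h ({length} bytes)")
```

Summing the run lengths gives the total number of differing bytes, which is how a figure such as the 36 bytes above can be cross-checked.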
Modification of the file
We were unable to upload a strategy to the PLC with ANY modification. This is most likely caused by the CRC checks.
How do we know it's a CRC value?
Running the strings command against the firmware binary file, we find the following function names (related to CRC):
It's clear that CRC16 flags and CRC32 values are computed throughout the APX file.
I tried calculating the CRC32 value of the entire file, using every starting position from byte 1 to byte 3000. None of them matched.
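One reading of this attempt can be sketched as follows: compute the CRC32 from each candidate start offset to the end of the file and compare it with the stored value. The sketch assumes the standard CRC-32 implemented by zlib, which may not be what the firmware uses; the function name is hypothetical, and since the byte order of the stored value is unknown, both interpretations are listed as candidate targets.

```python
import zlib

STORED = bytes.fromhex("00 C8 2D 0D")  # CRC bytes quoted earlier in the text
CANDIDATE_TARGETS = {
    int.from_bytes(STORED, "big"),
    int.from_bytes(STORED, "little"),
}

def find_crc32_start(data: bytes, target: int, max_start: int = 3000):
    """Return start offsets s (0-based) where crc32(data[s:]) == target."""
    view = memoryview(data)
    return [s for s in range(min(max_start, len(data)))
            if zlib.crc32(view[s:]) == target]

# Synthetic check: plant a known answer and confirm the search finds it.
sample = bytes(range(200))
target = zlib.crc32(sample[25:])
print(find_crc32_start(sample, target))
```

A negative result from such a search does not rule out a CRC32; it may only mean that the polynomial, the initial value or the covered byte range differs from what was assumed.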
I also tried calculating 16-bit CRC values over the different subsections, but could not obtain the correct CRC.
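Because many CRC-16 variants exist, a parameterised implementation makes it easy to try several of them against a subsection. The sketch below covers four well-known variants; which one the PLC uses, if any, is unknown, so the variant table is purely an assumption, and the helper names are hypothetical.

```python
def reflect(value: int, width: int) -> int:
    """Reverse the lowest `width` bits of value."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out

def crc16(data: bytes, poly: int, init: int, reflected: bool, xor_out: int = 0) -> int:
    """Bitwise CRC-16 with configurable polynomial, initial value and bit order."""
    crc = init
    for byte in data:
        if reflected:
            byte = reflect(byte, 8)
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    if reflected:
        crc = reflect(crc, 16)
    return crc ^ xor_out

# Common variants to try; this list is an assumption, not a known answer.
VARIANTS = {
    "CCITT-FALSE": dict(poly=0x1021, init=0xFFFF, reflected=False),
    "XMODEM": dict(poly=0x1021, init=0x0000, reflected=False),
    "ARC": dict(poly=0x8005, init=0x0000, reflected=True),
    "MODBUS": dict(poly=0x8005, init=0xFFFF, reflected=True),
}

def matching_variants(data: bytes, target: int):
    """Return the names of all variants whose CRC of data equals target."""
    return [name for name, p in VARIANTS.items() if crc16(data, **p) == target]

print(matching_variants(b"123456789", 0x4B37))
```

In practice, one would call matching_variants on each candidate byte range of a subsection, with the stored 16-bit value interpreted in both byte orders.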
As mentioned earlier, we think that the 16-bit CRC values found throughout the APX file validate the subsection that follows them: they always appear at the beginning of a subsection (or at the end of the previous one, which amounts to the same thing).