What is really in the header of binary files?


Post by Robbizz »

Hi everyone,
I recently built a serial cable for my Sinclair QL, and I'm trying to understand how to handle QL binary files and how to transfer them from the PC to the QL while keeping their headers intact.
I know this is a much-discussed topic online and there are many suggested solutions, but what I want to understand is what the header actually contains, what it is for, and how to preserve it.

1. First of all, I often find that Windows destroys the header of binary files as soon as you unpack a Sinclair QL .zip file. Is it only Windows, or do Linux and macOS do the same?

2. Does every executable binary file have a header? I often hear that a file will "probably" not work if it was unzipped outside the Sinclair QL. So sometimes there might not even be a header?

3. Then, following up with the questions: what does the header of binary files consist of?
I found this explanation online:

Header:

[255]
[length highest byte]
[length]
[length]
[length lowest byte]
[0]
[0]
[0]
[0]
[0]
[0]
[0]
[0]
[0]
[0]
[byte x length of actual data...]

So, sending a file with a length of 16 bytes would mean sending this data over the serial:

255,0,0,0,16,0,0,0,0,0,0,0,0,0,0,11,22,43,54,25,64,17,108,19,20,44,22,13,11,10,6

(The 16 bytes here are just bogus numbers).
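
Read literally, the layout above is a 15-byte prefix: one byte of 255, a four-byte length with the highest byte first, and ten zero bytes. A minimal Python sketch of that framing, assuming only what the quoted explanation states (the helper name `frame_for_serial` is made up for illustration, and nothing here confirms that the QL expects exactly this):

```python
import struct


def frame_for_serial(data: bytes) -> bytes:
    """Prefix data with the 15-byte header from the quoted layout (hypothetical helper)."""
    # 255, then the length as a 4-byte big-endian value, then ten zero bytes.
    header = bytes([0xFF]) + struct.pack(">I", len(data)) + bytes(10)
    return header + data


# The same 16 bogus data bytes used in the example above.
payload = bytes([11, 22, 43, 54, 25, 64, 17, 108, 19, 20, 44, 22, 13, 11, 10, 6])
framed = frame_for_serial(payload)
print(list(framed))
```

This reproduces the byte sequence listed above for a 16-byte file.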
Is it really like this? I have my doubts: I took an executable binary file that I had managed to run on a real Sinclair QL (not an emulator), opened it in a hex editor, and did not find any of the codes shown above at the beginning of the file.

4. From what I can understand (see point 3), the header of a binary executable file tells the system how much memory it must reserve for that program (please correct me, because this is only my assumption). So I assumed that if a command like:

EXEC MDV1_NOMEPROG

gives me this result:

BAD PARAMETER

it means that the program has lost its header. Well, I said to myself, if I can't get it to run with that command, I can reserve the memory myself with a few lines of SuperBASIC. If the program (for example Qterm) is 23172 bytes long, I can do it like this:

23172 / 1024 = 22.63

So I'll reserve 23 * 1024:

110 qterm=RESPR(23*1024)
120 LBYTES mdv1_qterm_cde,qterm
130 CALL qterm

And it actually works (with Qterm), but not always: I managed to transfer all the programs on the GST Assembler microdrive cartridge to the QL, and:

EXEC mdv1_asm

gives me:

BAD PARAMETER

So I tried reserving the space manually by calculating the size of the program, as I did in the Qterm example, but the QL freezes without displaying the flashing cursor, forcing me to reset.
Unfortunately I can't use programs like unzip on the QL because they are too large: I have an unexpanded QL (128K), and I use Qemulator, which in its basic version doesn't support more than 128K. I therefore thought of using one microdrive (in the QL emulator) to open the .zip files and a second microdrive mapped to a folder in my Mac's file system. I then manually copy the files from microdrive 1 to microdrive 2 (still under the emulator) and send them from that folder via serial to the real QL.



Post by dilwyn »

If your QL has Toolkit 2, you could use the COPY_H extension to force copying of a header. Or COPY_N to force copying without a header, e.g. if copying to a printer where any header bytes might accidentally act as control codes to put printers in random modes and spoil the listings.

But if that header had been lost on a native format drive which did not store QL executable file headers in the first place, even that can't fully restore the header.

So disappointing that after 40 years the principle of lost executable headers is still not understood despite it being explained everywhere.

If you have an emulator, there is a potential extra problem - some emulators get around the loss of executable file headers on native format drives by adding a preamble equivalent to the file header at one end of the file. Short of knowing how to strip these, you're onto a loser if that was the case.

But if you do have an emulator, you can use my Job2bas program on the sending machine to convert an executable to a BASIC program temporarily (which is not affected by loss of file headers), transfer that and you run it at the opposite end to recreate the executable.

If your EXEC program gives 'bad parameter' errors, it indicates loss of executable headers as you said. If you know what you're doing, you might be able to recreate that header by loading it temporarily into memory and using an SEXEC command to save it with header. Some trial and error will be needed for the dataspace value if you don't know it:

fl = (length of executable program in bytes)
base=respr(fl)
LBYTES mdv1_no_header_prog_exe,base
dataspace_value = (your best guess if you don't know it)
SEXEC mdv1_fixed_exe,base,fl,dataspace_value

If you make dataspace_value too small, the program won't have enough space for its data and will probably complain with an error message. If you make it too big, the program might work, but the too-large value is wasteful of memory, especially on a 128K machine.



Post by Robbizz »

dilwyn wrote: Sun Dec 03, 2023 4:38 pm If your QL has Toolkit 2, you could use the COPY_H extension to force copying of a header. Or COPY_N to force copying without a header, e.g. if copying to a printer where any header bytes might accidentally act as control codes to put printers in random modes and spoil the listings.
This is very interesting, but unfortunately my QL has a JM ROM.
dilwyn wrote: Sun Dec 03, 2023 4:38 pm But if that header had been lost on a native format drive which did not store QL executable file headers in the first place, even that can't fully restore the header.
This is something I'll have to check out.
dilwyn wrote: Sun Dec 03, 2023 4:38 pm So disappointing that after 40 years the principle of lost executable headers is still not understood despite it being explained everywhere.
I'm new to QL, but eager to learn.
dilwyn wrote: Sun Dec 03, 2023 4:38 pm But if you do have an emulator, you can use my Job2bas program on the sending machine to convert an executable to a BASIC program temporarily (which is not affected by loss of file headers), transfer that and you run it at the opposite end to recreate the executable.
I tried to use it just yesterday, but I can't get it to work. I saw that there is some documentation attached (JOB2BAS.doc), but it doesn't seem to be a text file, so I can't read it, and I don't know which program to open it with (if it was made with a program that runs on the QL, that wouldn't make a lot of sense). If I try to convert the asm program (under the emulator), I enter the source as mdv1_asm and then the BASIC file to be saved, for example mdv1_asm_basic, and it gives me this error: At line 550 in use...
dilwyn wrote: Sun Dec 03, 2023 4:38 pm If your EXEC program gives 'bad parameter' errors, it indicates loss of executable headers as you said. If you know what you're doing, you might be able to recreate that header by loading it temporarily into memory and using an SEXEC command to save it with header. Some trial and error will be needed for the dataspace value if you don't know it:

fl = (length of executable program in bytes)
base=respr(fl)
LBYTES mdv1_no_header_prog_exe,base
dataspace_value = (your best guess if you don't know it)
SEXEC mdv1_fixed_exe,base,fl,dataspace_value

If you make dataspace_value too small, the program won't have enough space for its data and will probably complain with an error message. If you make it too big, the program might work, but the too-large value is wasteful of memory, especially on a 128K machine.
Fantastic! Thank you!



Post by dilwyn »

JM QL still allows use of Toolkit 2 extensions. TBH, a QL without Toolkit 2 is pretty useless for this sort of thing.

Job2Bas_doc is a Quill doc file. PDF version attached to make life easier.

Bear in mind that it has to be run in SuperBASIC (or SBASIC) on the sending machine, i.e. you need a means to run QL BASIC programs to use Job2Bas to create the transfer files.

Once transferred, just LRUN the transferred file on the QL - it contains code to unravel itself. It might fail if the program being transferred is large enough to fill memory since a copy of the "created" program is built in memory alongside the running program - unfortunately the transferred prog will need Toolkit 2 (because of the use of ALCHP and RECHP extensions).

This can be largely circumvented by changing the ALCHP function to RESPR for a QL without Toolkit 2 extensions, and removing the reference to RECHP. Once it's run and saved the executable, you'll need to reset the QL, as RESPR memory cannot be released from software.
Attachment: JOB2BAS.pdf (Job2Bas documentation, 19.93 KiB)



Post by Derek_Stewart »

Hi,

I do not really worry about file headers. I transfer any QL software in a compressed archive format, usually ZIP, and uncompress the file in a QL environment.

If you do not have UNZIP software on the QL, there is a version of unzip that will create the executable from a file loaded into the QL's memory.

Alternatively, use a QL emulator to save the file(s) to an MDV or FLP image, transfer that image and use MDI/FDI to read it, or write the files to a QWA container, store it on an SD card and use QIMSI or QL_SD to read the files.

The file header process introduced by QL Emulators is a complete mess.

I personally use the Q68 to copy files to a QL via the QL Network.


Regards,

Derek

Post by bwinkel67 »

If you use the QLAY emulator with Windows directories mapped to the win device, it creates a qlay.dir file that contains the header information. There is an accompanying tool called qlayt.exe that lets you take a file without a header and simply add one by specifying the dataspace size. This won't sound very useful if you don't use QLAY as your primary emulator. However, even if that's the case, once you've added back the dataspace value, you can copy the file into an MDV file with QLAY, and that can then be used by vDrive. So the executable on the MDV file is then correct.

Also, the BASIC SuperCharge compiler comes with a tool that will also put the dataspace header back onto a file.



Post by tofro »

bwinkel67 wrote: Tue Dec 05, 2023 12:47 am If you use the QLAY emulator, when using Windows directories mapped to the win device, it creates a qlay.dir file that contains the header information. There is an accompanying tool called qlayt.exe that will allow you to take a file without the header, and then simply add a header to it by specifying the dataspace size. This all won't sound very useful if you don't use QLAY as your primary emulator. However, even if that's the case, once you've added back the dataspace value, you can copy the file into an MDV file with QLAY and that can then be used by vDrive. So the executable on the MDV file is then correct.

Also, the BASIC SuperCharge compiler comes with a tool that will also put the dataspace header back onto a file.
Even in-built SuperBASIC comes with tools that allow you to easily (somewhat) restore the header :) It's absolutely not a problem of missing tools....

The problem is that all this tooling can't know what the requirements of the specific software that has lost its header are. If you set the value too low, the software might crash; if you set it too high, space is wasted.

The only real way around this problem is: Never unpack a QL zip on a non-QL file system. If you keep this in mind, the header is a non-problem.


ʎɐqǝ ɯoɹɟ ǝq oʇ ƃuᴉoƃ ʇou sᴉ pɹɐoqʎǝʞ ʇxǝu ʎɯ 'ɹɐǝp ɥO

Post by dilwyn »

tofro wrote: Tue Dec 05, 2023 7:19 am The only real way around this problem is: Never unpack a QL zip on a non-QL file system. If you keep this in mind, the header is a non-problem.
Yeah, preventing a problem is always better than trying to fix it afterwards. Learn to walk before trying to run.



Post by tofro »

To answer the questions in order:
Robbizz wrote: Sun Dec 03, 2023 4:13 pm 1. First of all, I often find that Windows destroys the header of binary files as soon as you unpack a Sinclair QL .zip file. But only Windows? Linux, Mac doesn't do it?
Any non-QDOS file system will be unable to store the header without further precautions, and will thus destroy it. That is a generic problem.
Robbizz wrote: Sun Dec 03, 2023 4:13 pm 2. Does every executable binary file have a header? I often hear talk that "probably" if unzipped out of Sinclair QL it won't work. So sometimes there might not even be a header?
Every QDOS file on a QDOS file system has a 64-byte header. The headers are (rather: may be) useful for any file type, but are essential mainly for executables (the ones you start with EXEC and the like). The header is basically a copy of the file's directory entry in the QDOS file system. If you count position-independent binaries that are loaded with LBYTES and executed using CALL as "binary executables", those don't need their header to work. In addition, there are some very exotic programs, which you probably won't come across for some time, that use the "file type specific information" (see below) for other purposes.
Robbizz wrote: Sun Dec 03, 2023 4:13 pm 3. Then, following up with the questions: what does the header of binary files consist of?
From the QDOS/SMSQ Reference manual (emphasis mine):
Each file is assumed to have a 64-byte header (the logical beginning of file is set to byte 64, not byte zero). This header should be formatted as follows:

$00   long         file length
$04   byte         file access key (used by third parties software)
$05   byte         file type
$06   8 bytes      file type-dependent information
$0E   2+36 bytes   file name
$34   long         update date [EXT,DD2]
$38   word         version number [DD2]
$3A   word         reserved
$3C   long         backup date [DD2]

The current file types allowed are: 2, which is a relocatable object file; 1, which is an executable program; and 0 which is anything else. In the case of file type 1, the first longword of type-dependent information holds the default size of the data space for the program.

For level 2 and level 3 devices, a type of -1 (or 255 decimal) stands for a subdirectory.
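
As a rough illustration of the layout quoted above, here is a minimal Python sketch that decodes a raw 64-byte header block on the PC side. It assumes you already have those 64 bytes from somewhere (the QL itself hides them from normal file reads), that all values are big-endian as on the QL's 68000-family CPU, and that the "2+36 bytes" name field is a length word followed by up to 36 characters. The helper name `parse_ql_header` and the sample values (including the 4096-byte data space) are made up for illustration.

```python
import struct

# Big-endian layout of the 64-byte header, following the table quoted above.
# I = long, B = byte, H = word; the name is read as a length word + 36 characters.
HEADER_FORMAT = ">IBB8sH36sIHHI"
assert struct.calcsize(HEADER_FORMAT) == 64


def parse_ql_header(raw: bytes) -> dict:
    """Decode a raw 64-byte header block into named fields (hypothetical helper)."""
    if len(raw) != 64:
        raise ValueError("expected exactly 64 bytes")
    (length, access, file_type, type_info, name_len, name,
     update_date, version, _reserved, backup_date) = struct.unpack(HEADER_FORMAT, raw)
    header = {
        "length": length,
        "access": access,
        "type": file_type,
        "name": name[:name_len].decode("latin-1"),
        "update_date": update_date,
        "version": version,
        "backup_date": backup_date,
    }
    if file_type == 1:
        # Executable: the first longword of the type-dependent bytes is the data space.
        header["dataspace"] = struct.unpack(">I", type_info[:4])[0]
    return header


# Made-up sample: a 23172-byte executable with an invented 4096-byte data space.
sample = struct.pack(
    HEADER_FORMAT,
    23172, 0, 1, struct.pack(">I", 4096) + bytes(4),
    9, b"qterm_cde", 0, 0, 0, 0,
)
info = parse_ql_header(sample)
print(info["name"], info["length"], info["dataspace"])
```

Only type 1 files get a `dataspace` entry here, which mirrors the point that the data space value is specific to executables.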
Robbizz wrote: Sun Dec 03, 2023 4:13 pm
It's really like this? I have doubts because by opening an executable binary file, where I managed to make it work with a real Sinclair QL (not an emulator), I opened that file with a hex editor and at the beginning of the file I did not find any of the codes exposed above.
You won't be able to read the file header with standard file open methods, because the system hides it from you. You need to use specific "Read header" calls to access it. Other than using specific toolkits (like DIY TK GetHead, for example), you will not be able to access the header directly from BASIC.

In machine code, the IOF.RHDR trap will allow you to read a file header.
Robbizz wrote: Sun Dec 03, 2023 4:13 pm 3. From what I can understand (see point 3), the header of a binary executable file is used to inform the system of how much memory the system must reserve for that program (of course, correct me because this is my assumption). So, I thought, what if a command like:

EXEC MDV1_NOMEPROG
It gives me as a result:

BAD PARAMETER
It means that program has lost its header. Well, I said to myself; because if I can't get it to run with that command, I can reserve the memory myself with a few lines of SuperBASIC. If the program (for example Qterm) is 23172 Bytes large I can do it like this:....
First, the information in the first longword of the "file-type-dependent information" in the header has nothing to do with the program size, but rather with its dynamic data requirements (the room reserved for the CPU stack and variables). The program size is simply the file size, but the size of the data requirements cannot be reconstructed from the file on disk once it's lost.

To cut a long story short, you could load a binary into memory in the same format that EXEC does using ALCHP and funny POKES, but you simply cannot create a Job (in principle, a QDOS "process") from it using S*BASIC commands.

A much simpler way to try to fix lost headers (using TK2 commands) is:

siz = FLEN (\<filename>)
mem = ALCHP (siz)
LBYTES <filename>,mem
SEXEC <filename>,mem, siz, <assumed size of data space>
<assumed size of data space>, well you need to assume it, because it's been lost, remember? Anything between 512 bytes and 32k might be reasonable, only very specific compiled S*BASIC programs have much larger data space requirements. You simply need to play around (well, that's why we're saying you should avoid losing that header in the first place...) until it works reliably and hopefully doesn't waste too much space in memory. After that, the program should at least be EXECutable again (but it might crash if your assumption was wrong).
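
If you want to be a bit systematic about the guessing, one option is to step through the 512 bytes to 32k range in doublings, smallest first, and keep the first value that runs reliably. A small Python sketch that just lists such trial values; the doubling strategy and the name `dataspace_candidates` are my own assumptions, not something the QL or TK2 prescribes:

```python
def dataspace_candidates(low: int = 512, high: int = 32 * 1024):
    """Yield trial data space sizes, doubling from low up to high inclusive."""
    size = low
    while size <= high:
        yield size
        size *= 2


# Values to try, in order, as the last parameter of SEXEC.
candidates = list(dataspace_candidates())
print(candidates)
```

Starting small means the first value that works is also one that wastes little memory, which matters on a 128K machine.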


ʎɐqǝ ɯoɹɟ ǝq oʇ ƃuᴉoƃ ʇou sᴉ pɹɐoqʎǝʞ ʇxǝu ʎɯ 'ɹɐǝp ɥO

Post by XorA »

The only real way around this problem is: Never unpack a QL zip on a non-QL file system. If you keep this in mind, the header is a non-problem.
If only people were writing tools to allow you to do this that work with all the major emulators?

https://github.com/xxoraa/qem-unzip

