Fun things to do with an MC68EC020....

Re: Fun things to do with an MC68EC020....

Post by Dave »

Cool answers everyone. Very enlightening. Thank you.


Post by Peter »

Hi Paul, Nasta, Dave,

since this thread is named "Fun things to do with an MC68EC020", I have a funny and very unusual idea for you to think about. :idea:

A bit like an intellectual game, but it could turn out more practical than it looks at first, allowing the simplest, yet 32 Bit, QL 68EC020 hardware ever. :D
No dealing with shadow RAM. No need for 32 Bit wide ROM. Easiest OS updates without need for Flash. Most simple address decoding you can imagine. And even supplying mass storage.

1. Simply make all areas 32 Bit wide SRAM except $1C000 to $28000, which you make an 8 Bit wide QL style bus.
2. Provide 8 KB of primitive 8 bit ROM somewhere between $1C000 and $20000 (No ROM at $0!)
3. Provide 4 simple I/O pins (3 registered out, one input with pullup) connected to an SD card socket.
4. Place 4 pullups on some specific data lines (and maybe pulldowns on the other data lines)
5. Provide a register that disables SRAM chip select on powerup, and can be set later.

The major trick is: On the first two 32 Bit wide CPU accesses to SRAM area at address $0 and $4, the pullups will "feed" the 68EC020 with an invalid SSP, but meaningful reset vector to a loader in ROM, which then enables SRAM Chip select. The loader sets SSP, fetches the ROM binary (Minerva plus storage driver, e.g. Qubide or QL-SD) from SDHC card, copies it to $0 in SRAM, then loads new SSP from $0 and jumps to the new reset vector at $4.
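A minimal sketch of the pull-up trick, modelling what the CPU would read from an undriven data bus. Which four data lines carry the pull-ups is an assumption (the post does not name them); D13..D16 are used here only because they yield $0001E000, which places an 8 KB ROM at $1E000..$1FFFF inside the window from step 2.

```python
# Sketch: value read from an undriven 32-bit data bus with resistors on it.
# PULLUP_BITS is a hypothetical choice; the post does not name the lines.
PULLUP_BITS = (13, 14, 15, 16)          # all other data lines pulled down
ROM_WINDOW = range(0x1C000, 0x20000)    # "somewhere between $1C000 and $20000"
ROM_SIZE = 8 * 1024

def floating_bus_value(pullup_bits):
    """Return the longword seen when nothing drives the bus."""
    value = 0
    for bit in pullup_bits:
        value |= 1 << bit
    return value

def reset_fetch(pullup_bits):
    """The reads at $0 and $4 both see the same resistor pattern."""
    v = floating_bus_value(pullup_bits)
    return v, v  # (initial SSP, initial PC)

ssp, pc = reset_fetch(PULLUP_BITS)
rom_fits = pc in ROM_WINDOW and (pc + ROM_SIZE - 1) in ROM_WINDOW
print(f"SSP=${ssp:08X} PC=${pc:08X} ROM fits: {rom_fits}")
```

Both vectors come out identical, so the SSP is not usable stack space, which is presumably why the loader sets its own SSP before doing anything else.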

Did I miss something? What do you think? :D
Peter
Last edited by Peter on Fri Mar 17, 2017 11:54 am, edited 1 time in total.


Post by Peter »

PS: Even if you leave it at QL clock speed for simplicity, I estimate that system would be as fast as a Gold Card.


Post by tofro »

Peter wrote: The major trick is: On the first two 32 Bit wide CPU accesses to SRAM area at address $0 and $4, the pullups will "feed" the 68EC020 with an invalid SSP, but meaningful reset vector to a loader in ROM, which then enables SRAM Chip select. The loader sets SSP, fetches the ROM binary (Minerva plus storage driver, e.g. Qubide or QL-SD) from SDHC card, copies it to $0 in SRAM, then loads new SSP from $0 and jumps to the new reset vector at $4.
Peter,

I've seen a design from a Russian guy who pushed a 68020 with RAM at $0 into 8-bit fetch mode on reset using an AVR. He overrode the clock with a port line and, quasi single-stepping, fed the reset stack pointer and address into the M68k using some other AVR port lines, simply living with the bus collisions. And, apparently, that worked...

Tobias


ʎɐqǝ ɯoɹɟ ǝq oʇ ƃuᴉoƃ ʇou sᴉ pɹɐoqʎǝʞ ʇxǝu ʎɯ 'ɹɐǝp ɥO
Post by Nasta »

This is basically how the SGC works, but it manipulates the decoder. Without some sort of mechanism like this, it is impossible to run different OSes, because the exception vectors and vectored OS routines need to change when the OS is changed. There are many ways to implement this. As far as I know, the GC/SGC does it by initially decoding its on-board ROM at address 0, where the initial PC points to an alias of the same ROM at the address where it will always be once the system starts. In other words, it disables its address decoding and just aliases the ROM all over the address map.

Once the loader starts, the 'real' address map is established by enabling the decoder. This can be done either by detecting an access to the actual decoded ROM address (usually by simply detecting an address line going high), or by having an instruction write data to a certain address to switch on the decoder. The latter method may have some advantages. I think the GC uses the former, the SGC the latter method.
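The two ways of switching on the decoder can be sketched as a small model. ROM_BASE and ROM_SIZE are hypothetical values chosen for illustration, not the real GC/SGC map, and the IO area is ignored.

```python
# Sketch of a decoder that aliases the boot ROM everywhere until enabled.
# ROM_BASE and ROM_SIZE are hypothetical, not taken from the GC or SGC.
ROM_BASE = 0x40000
ROM_SIZE = 0x10000

class BootDecoder:
    def __init__(self):
        self.enabled = False  # power-on state: no real decoding yet

    def read_target(self, addr):
        """Return (device, offset) for a CPU read at addr."""
        in_rom = ROM_BASE <= addr < ROM_BASE + ROM_SIZE
        if not self.enabled:
            # Former method: the first access to the real ROM address
            # (an address line going high) enables the decoder.
            if in_rom:
                self.enabled = True
            return ("rom", addr % ROM_SIZE)
        if in_rom:
            return ("rom", addr - ROM_BASE)
        return ("ram", addr)

    def write_enable_register(self):
        """Latter method: an explicit write switches the decoder on."""
        self.enabled = True

dec = BootDecoder()
print(dec.read_target(0x4))      # reset vector fetch hits the aliased ROM
print(dec.read_target(0x40100))  # first fetch at the real ROM address
print(dec.read_target(0x4))      # from now on there is RAM at 0
```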

Of course, the decoder decodes the 'boot' ROM exactly where the code is now executing :) (otherwise it would all go belly-up). It also decodes RAM at 0, exactly as Peter said, save for the IO area ($18000-$1BFFF) and some special consideration for the original screen.

The next step is testing the RAM at 0 (remember that the OS is not up at this time and no tests have been made). Normally the OS does not expect RAM here anyway, so some testing has to be done because of the next step, which is copying a ROM image to RAM: you must ensure that the RAM where the OS is going to sit is working properly.
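As an illustration only, a pattern test over the region that will hold the OS copy might look like the sketch below. The patterns and the fake faulty memory are assumptions made for the example; real boot code would do this in 68k assembler.

```python
# Sketch of a simple RAM test; a bytearray stands in for real memory.
def check_ram(ram, start, length, patterns=(0x55, 0xAA, 0x00)):
    """Write each pattern, read it back; return first bad address or None."""
    for pattern in patterns:
        for addr in range(start, start + length):
            ram[addr] = pattern
        for addr in range(start, start + length):
            if ram[addr] != pattern:
                return addr
    return None

class StuckBitRam:
    """Fake memory whose cell at bad_addr has bit 0 stuck low."""
    def __init__(self, size, bad_addr):
        self.cells = bytearray(size)
        self.bad_addr = bad_addr

    def __setitem__(self, addr, value):
        self.cells[addr] = value & 0xFE if addr == self.bad_addr else value

    def __getitem__(self, addr):
        return self.cells[addr]

print(check_ram(bytearray(0xC000), 0, 0xC000))            # good RAM
print(check_ram(StuckBitRam(0xC000, 0x1234), 0, 0xC000))  # faulty cell
```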

At this point one might ask, how do we get to an image of an OS? For one thing, we know we have some sort of ROM storage from which we are just now executing code. So, this is one option, if the size is large enough. We also know that, sitting at $0000..$BFFF, there is the QL ROM (remember, I am talking here about a system like the SGC). Also, if we were clever enough :) the much larger address map of our upgraded CPU might enable us to have a whole alias of the 1M QL address map somewhere else, at some high address, so the 1M of address space we would expect at $000000..$0FFFFF also appears to the CPU at, say, $F00000..$FFFFFF. This means we can get at anything on the QL bus simply by offsetting the address by $F00000. So, there would be a QL ROM at $F00000..$F0BFFF.
In fact, since we are not any more looking at a bare QL, we can use some parts of the original 1M of QL address space for other things. Like, a boot ROM, or in fact, a much larger Flash chip, which not only holds the boot code, but also several OS images and add-ons.
This last option is the most flexible, as it involves copying an OS image from whatever storage to RAM, perhaps with a method of choosing which one. The first option is actually a form of the third option, with one difference: the boot ROM, which may also contain an OS image or images, is only visible temporarily, until the OS is copied to RAM. The SGC uses a variant of the third option: the first 256k of the QL's address map also appears at $400000, and this is where the SGC boot code finds the QL ROM (whatever version), copies it into RAM at $0, and patches it to run with the extended hardware.
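The address aliasing described above is plain offset arithmetic. A short sketch, using only the two base addresses mentioned in the post (the function names are made up for the example):

```python
# Sketch: where QL bus addresses show up in the larger address map.
QL_ALIAS_BASE = 0xF00000   # the example alias of the whole 1M QL map
SGC_ALIAS_BASE = 0x400000  # SGC: first 256k of the QL map also appears here
SGC_ALIAS_SIZE = 0x40000

def ql_alias(ql_addr):
    """CPU address of QL bus address ql_addr via the high alias."""
    if not 0 <= ql_addr <= 0xFFFFF:
        raise ValueError("outside the 1M QL address map")
    return QL_ALIAS_BASE + ql_addr

def sgc_alias(ql_addr):
    """CPU address of QL bus address ql_addr via the SGC alias."""
    if not 0 <= ql_addr < SGC_ALIAS_SIZE:
        raise ValueError("outside the aliased first 256k")
    return SGC_ALIAS_BASE + ql_addr

print(f"QL ROM via high alias: ${ql_alias(0x0000):06X}..${ql_alias(0xBFFF):06X}")
print(f"QL ROM via SGC alias:  ${sgc_alias(0x0000):06X}..${sgc_alias(0xBFFF):06X}")
```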

The second option is a bit more complex because it requires one extra step, and that is reconfiguring the decoder to access the QL bus at $0000..$BFFF when reading (finding the OS ROM there) and accessing the RAM when writing.
In all options the boot code then copies the OS image to RAM starting at 0. In option 2 above it is slightly interesting because it involves reading and then writing to the same address :)

The penultimate step is to reconfigure the decoder to write protect the portion of RAM where the OS copy is now situated. In reality this is usually the first 48 or 64k. This seems to be mandatory; apparently some software is known to attempt writes to low addresses.

The final step is, either a soft reset (no external hardware is reset, otherwise the whole procedure would be triggered from the start), which resets the CPU, which then starts executing from the RAM copy of ROM as usual, OR, the boot code simulates a reset by loading the initial SSP and jumping to the initial PC.
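The "simulated reset" amounts to reading two big-endian longwords from the start of the RAM copy. A sketch, with made-up vector values standing in for a real OS image:

```python
import struct

def reset_vectors(memory):
    """Return (initial SSP, initial PC) from the first 8 bytes, big-endian."""
    ssp, pc = struct.unpack_from(">II", memory, 0)
    return ssp, pc

# Hypothetical image: only the two vectors matter here, the values are invented.
image = struct.pack(">II", 0x00028000, 0x0000016A) + bytes(56)
ram = bytearray(0x10000)
ram[0:len(image)] = image   # the copy of the OS image to RAM at 0

ssp, pc = reset_vectors(ram)
print(f"load SSP=${ssp:08X}, jump to ${pc:08X}")
```

On the real hardware the boot code would do the equivalent with two loads and a jump; the soft reset variant lets the CPU fetch the same two longwords itself.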

The careful observer will have noticed that I have referred twice to a method of changing the behavior of the decoder, at the very least to write protect the first 48k of RAM. It is possible to implement this in a 'one-time-only' manner, which returns to the initial state only on power-on reset, but it is more flexible to implement it using some sort of simple 'write only' register(s) sitting at an address somewhere in the rest of the IO area ($18000..$1BFFF). There would be 3 basic states: power on reset (boot code at 0), RAM load (RAM at 0, read-write), and RAM protect (RAM at 0, read only). Making it possible to change the state at any time enables loading a new OS from software. This is how SMSQ/E is loaded on the SGC.
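The three decoder states can be modelled in a few lines. This is a simplified sketch: the class and constant names are invented, the protected size of 64k is an assumption (48k is equally possible), and the IO and screen areas are left out.

```python
from enum import Enum

class Mode(Enum):
    BOOT = 0         # power-on reset: boot code at 0
    RAM_LOAD = 1     # RAM at 0, read-write
    RAM_PROTECT = 2  # RAM at 0, read only

PROTECTED_SIZE = 0x10000  # assumption: protect the first 64k

class Decoder:
    def __init__(self, boot_rom):
        self.mode = Mode.BOOT
        self.boot_rom = boot_rom
        self.ram = bytearray(0x40000)

    def set_mode(self, mode):
        """Models a write to the 'write only' state register."""
        self.mode = mode

    def read(self, addr):
        if self.mode is Mode.BOOT and addr < len(self.boot_rom):
            return self.boot_rom[addr]
        return self.ram[addr]

    def write(self, addr, value):
        # Low memory is writable only while loading the OS image.
        if addr < PROTECTED_SIZE and self.mode is not Mode.RAM_LOAD:
            return
        self.ram[addr] = value

dec = Decoder(boot_rom=bytes([0x4E, 0x71]))
dec.set_mode(Mode.RAM_LOAD)
dec.write(0, 0x60)            # copy (one byte of) the OS image
dec.set_mode(Mode.RAM_PROTECT)
dec.write(0, 0xFF)            # stray write to low memory is ignored
print(hex(dec.read(0)))
```

Going back to RAM_LOAD at any time is what allows a new OS to be loaded from software.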

In Peter's scenario, the OS store is an SD card (connected in SPI mode), which is a nice possibility because it enables management of OS images externally. However, boot code is not trivial. It may well be a version of the OS (e.g. Minerva) with a boot program (in SBASIC?) to find, choose and load OS images and extensions. The only slight disadvantage is that the storage is fairly slow, as it is entirely software driven, but since booting is basically a 'one time' thing, this is not a big problem.

There is a hardware consideration, and that is the size of the boot loader ROM. Today it comes down to choosing an easily available and cheap flash chip, and usually what you end up with is a 29F040. By QL standards it's huge - half a meg - but the smaller ones are not much cheaper, and also may be slower. However, given that we have an upgraded CPU, there is certainly address space for it. But then, there is more than enough space for several OS images including something like SMSQ/E in it to begin with :)
Interfacing an SD card as an SPI device is dead easy (hardware-wise) but needs a driver of some sort. On the other hand, having a means of choosing the OS image to be loaded also requires software.
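To show why the hardware side is so small, here is a sketch of a software-driven SPI byte transfer over three outputs and one input. The pin object is a stand-in with MISO looped back to MOSI instead of a real card, and SPI mode 0 with MSB first is assumed, as SD cards use in SPI mode.

```python
class LoopbackPins:
    """Fake port: 3 outputs (CS, CLK, MOSI), 1 input (MISO) wired to MOSI."""
    def __init__(self):
        self.cs = 1    # card deselected
        self.clk = 0   # clock idles low (mode 0)
        self.mosi = 1

    def read_miso(self):
        return self.mosi  # loopback stands in for a real card

def spi_transfer(pins, byte_out):
    """Shift one byte out and one byte in, MSB first, SPI mode 0."""
    byte_in = 0
    for bit in range(7, -1, -1):
        pins.mosi = (byte_out >> bit) & 1   # set data while the clock is low
        pins.clk = 1                        # rising edge: data is sampled
        byte_in = (byte_in << 1) | pins.read_miso()
        pins.clk = 0                        # falling edge: data may change
    return byte_in

pins = LoopbackPins()
pins.cs = 0                                 # select the card
print(hex(spi_transfer(pins, 0xA5)))
pins.cs = 1
```

Everything above the byte level (card initialisation, block reads, finding the image) is the driver work the post refers to.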

An approach that is simpler software-wise, but more complex hardware-wise, would actually map a selectable 64k portion of the flash to the first 64k. This is really still only viable if it is actually copied into 32-bit RAM, or run from 32-bit flash. Also, management of the flash is more difficult. For one, you cannot program a flash chip while it is being read simultaneously for code execution, in other words, we come back to running the OS from a RAM emulation of ROM.

The impact of a wide bus on the speed of the CPU should not be underestimated. Peter is right in expecting a 3-4 fold speed increase even at QL clock speeds, purely on account of a wide bus and improved CPU architecture. Remember that QL software heavily relies on OS code, which means it is being executed all of the time. Doing that from narrow memory will severely reduce the perceived speed of the system.


Post by Peter »

tofro wrote: I've seen a design from a Russian guy who pushed a 68020 with RAM at $0 into 8-bit fetch mode on reset using an AVR. He overrode the clock with a port line and, quasi single-stepping, fed the reset stack pointer and address into the M68k using some other AVR port lines, simply living with the bus collisions. And, apparently, that worked...
Also a nice trick, but far more difficult to implement.
Nasta wrote:This is basically how the SGC works, but it manipulates the decoder. [Explanation of SGC snipped]
Ahem... the whole point of my idea was not to implement decoding, mapping and bus sizing as complicated as the SGC. ;)


Post by Nasta »

Peter wrote:
Nasta wrote:This is basically how the SGC works, but it manipulates the decoder. [Explanation of SGC snipped]
Ahem... the whole point of my idea was not to implement decoding, mapping and bus sizing as complicated as the SGC. ;)
I know, but you have to implement 90% of it or more, anyway. I.e. once it gets into a small PLD like a GAL, it's very little work to implement the rest of it.
The very minimum you have to do (assuming you want legacy QL hardware for IO, i.e. the J1 bus) is decode the slow $18000..$1BFFF area, plus the slightly more complicated access to screen RAM at $20000..$27FFF or ..$2FFFF (for two screens). You have to have a ROM decode for boot (or anything else including the OS), also the port you need for SD card access (which is as easy to implement in 3 bits as in a few more), and full 32-bit RAM decode including byte select for RAM. Adding 10% more trickery on top of that is very little work.
The only difference is putting pull-up/down resistors on the bus in order to generate an initial PC (which can just as well execute code to load the SSP, so the initial SSP vector can more or less be ignored, and interrupts will be disabled anyway). Sinclair himself would probably opt to save on resistors, since you really need 32 of them :P


Post by Peter »

Nasta wrote:I know, but you have to implement 90% of it or more, anyway. [...] The only difference is putting pull-up/down resistors on the bus in order to generate an initial PC
I don't think so. I was also referring to ease of understanding, not just chip count.

There is only one single, fixed, continuous $10000..$27FFF "slow QL area", where 8 bit sizing is enforced and the QL can "see" the access. ROM/SDHC simply treated as QL peripherals, speed doesn't matter for a single copy.

All the rest is 32 Bit sized "fast SRAM area", including $0. Simply decoded by one inverter.
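As a sketch of that address map, taking the $10000..$27FFF range from this post at face value (the function names are made up for the example):

```python
# Sketch: one fixed slow area, everything else is fast 32 Bit SRAM.
SLOW_START, SLOW_END = 0x10000, 0x27FFF   # the "slow QL area" from the post

def is_slow_ql(addr):
    return SLOW_START <= addr <= SLOW_END

def sram_select(addr):
    """The 'one inverter': SRAM is simply NOT the slow area."""
    return not is_slow_ql(addr)

def port_width_bits(addr):
    return 8 if is_slow_ql(addr) else 32

for a in (0x00000, 0x0FFFF, 0x10000, 0x18000, 0x27FFF, 0x28000):
    print(f"${a:05X}: {port_width_bits(a)} Bit, SRAM select = {sram_select(a)}")
```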


Post by tofro »

I was always under the impression that the whole copying ROM across, patching and paging and shadowing was only done for copyright reasons and could be made much simpler today when a 32-bit ROM could be temporarily paged in at $0 during the first and/or all RESETs and actually live way up in the address space.

Tobias


ʎɐqǝ ɯoɹɟ ǝq oʇ ƃuᴉoƃ ʇou sᴉ pɹɐoqʎǝʞ ʇxǝu ʎɯ 'ɹɐǝp ɥO
Post by Nasta »

Peter wrote:
Nasta wrote:I know, but you have to implement 90% of it or more, anyway. [...] The only difference is putting pull-up/down resistors on the bus in order to generate an initial PC
I don't think so. I was also referring to ease of understanding, not just chip count.

There is only one single, fixed, continuous $10000..$27FFF "slow QL area", where 8 bit sizing is enforced and the QL can "see" the access. ROM/SDHC simply treated as QL peripherals, speed doesn't matter for a single copy.

All the rest is 32 Bit sized "fast SRAM area", including $0. Simply decoded by one inverter.
Again, I understand, though in there you need to decode the SDHC and the boot ROM, plus a means to write protect the first 64k (don't forget this, GC and SGC both had problems with this), and in fact a means not to enable the RAM initially at all, because the CPU will otherwise read the initial PC from un-initialized RAM.

Normally, you could treat this area as a QL peripheral and do the decoding there, but then you are building two things :) It is also possible that the 'disable RAM at power on' bit makes that clumsy, even though in theory you could implement DSMCL to do something like that (and explaining how DSMCL works even on a normal QL is a challenge; it is in fact a 'negative decoder' of sorts).

Also, you still need to do write enable decoding with something like a GAL as it's by far the easiest way. But once you do this, you are in the realm of and-or logic, and then you get the capability to do more complex decoding for no extra cost in hardware or understanding how the decoding is done. You DO need a bit more understanding of what decoding IS and how to cleverly use it, which I suppose is of interest to the developer, especially since using a GAL approach enables an incremental approach - you can start simple and then refine the decoder, but you do need to think a bit in advance, in order to provide all the necessary signals to the GAL.

Arguably, it's not just down to chip count. Decoding 96k out of 16M, and then further sub-dividing it into 16k in the middle, by standard logic, inevitably means a number of chips. It is true that once you have that, you do indeed need only one inverter to decode RAM, as the RAM decode is essentially 'NOT $10000...$2FFFF' :)
You still need to decode byte write enables (BTW this works independently of and in parallel with 8-bit bus sizing even for 32-bit shadow access; bus sizing makes it so simple that only decoding is affected), which is not trivial in standard logic.
And you still need the 'slow down' mechanism to get the 020, even clocked at the QL rate, to work reliably on the QL bus (buffering/latching addresses and data, DSL generation, and then delaying DSMCL and making the appropriate DSACKL signals from it). This one is already more difficult to understand than even fanciful decoding, and you need the same logic if you need to access even a single byte of the QL screen reliably. BTW that part is simpler if you only need to write the screen (so one more argument for shadowing).
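The byte write enables mentioned above can be sketched as follows. This reflects one reading of the 68020 bus sizing rules for a 32-bit port (SIZ1:SIZ0 gives the remaining operand size, A1:A0 the starting byte lane) and is an illustration, not a GAL equation set.

```python
# SIZ1:SIZ0 on the 68020: 01 = byte, 10 = word, 11 = 3 bytes, 00 = long word.
SIZE_BYTES = {0b01: 1, 0b10: 2, 0b11: 3, 0b00: 4}

def byte_enables(siz, a1a0):
    """Write enables for lanes (D31-24, D23-16, D15-8, D7-0) of a 32-bit port."""
    first = a1a0 & 3
    # The transfer covers consecutive lanes and stops at the port edge.
    last = min(first + SIZE_BYTES[siz & 3], 4) - 1
    return tuple(first <= lane <= last for lane in range(4))

print(byte_enables(0b00, 0))  # long word at offset 0: all four lanes
print(byte_enables(0b10, 2))  # word at offset 2: lower two lanes
print(byte_enables(0b01, 1))  # byte at offset 1: one lane
```

In a GAL the same table becomes four sum-of-products terms on SIZ1, SIZ0, A1 and A0, combined with the RAM select and the write strobe.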

This is what I meant with the '90% you need to do anyway'.

