Hi all,
I've used flash a lot, but mostly from the hardware side or using ready-made tools to flash newly assembled boards for testing or firmware updates.
I love seeing you all talk about the finer details of flash programming. I'm aware of the timing requirements and flash's feedback mechanism for signalling when it's ready to take some more data. That always seems to be handled in the software realm by polling.
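For anyone following along, the polling mentioned above might look roughly like this in C. It is only a sketch and assumes a NOR flash with AMD/JEDEC-style status bits (DQ6 toggles on successive reads while a program or erase is in progress, DQ5 flags that the internal time limit was exceeded); the actual part may behave differently, so check its datasheet. The names flash_wait_ready, flash_read_fn and fake_flash_read are made up for illustration, and the fake device exists only so the loop can be exercised off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DQ6_TOGGLE 0x0040u  /* toggles on each read while the flash is busy (assumed) */
#define DQ5_ERROR  0x0020u  /* set when the internal time limit is exceeded (assumed) */

enum flash_status { FLASH_READY = 0, FLASH_TIMEOUT = 1, FLASH_ERROR = 2 };

/* Hypothetical accessor: returns one status read from the flash. */
typedef uint16_t (*flash_read_fn)(void *ctx);

/* Poll until DQ6 stops toggling, DQ5 reports a failure, or max_polls runs out. */
enum flash_status flash_wait_ready(flash_read_fn read, void *ctx,
                                   unsigned long max_polls)
{
    assert(read != NULL);
    uint16_t prev = read(ctx);
    for (unsigned long i = 0; i < max_polls; i++) {
        uint16_t cur = read(ctx);
        if (((prev ^ cur) & DQ6_TOGGLE) == 0)
            return FLASH_READY;            /* two equal reads: operation finished */
        if (cur & DQ5_ERROR) {
            /* The operation may have completed just as DQ5 rose, so re-check. */
            prev = read(ctx);
            cur = read(ctx);
            return (((prev ^ cur) & DQ6_TOGGLE) == 0) ? FLASH_READY : FLASH_ERROR;
        }
        prev = cur;
    }
    return FLASH_TIMEOUT;
}

/* Stand-in device for testing on a PC: toggles DQ6 for busy_reads reads. */
struct fake_flash { unsigned busy_reads; uint16_t dq; };

uint16_t fake_flash_read(void *ctx)
{
    struct fake_flash *f = ctx;
    if (f->busy_reads > 0) {
        f->busy_reads--;
        f->dq ^= DQ6_TOGGLE;
    }
    return f->dq;
}
```

On real hardware the read function would be a volatile read from an address inside the flash, and the loop would have to run from RAM, since the flash cannot supply code while it is busy programming itself.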
I think the big thing to sort out is the mechanism for loading and executing code that stops the OS task scheduler, disables interrupts, or both. Also, I'm not clear, though I am sure many here are, whether it is possible to return to the operating system after this step or whether a restart is 'required'...
As a random aside (woohoo for those paying attention!) today I fired up a new proto board with 8MB RAM and 2MB flash, using equations that ignore the flash and read the QL's OS instead. It'll be like that until there is some way to do in-system flashing. It's the first flash that's the challenge. I could pre-program the flash, but it's TFBGA. Adapters are VERY expensive.
Aaaanyhow, 7.5 MHz <-> 30 MHz clock switching is working perfectly on two boards now. Runtless (absolutely critical) switching is down. There is a switch to force 7.5 MHz only for those that want it. It's still much faster than a BBQL due to the 16-bit bus.
This board exists to give me easy JTAG access to reprogram the CPLD and test points to measure timing of CPU cycles with 68020 or 68030 on daughter cards. The 68SEC000 takes the bus from the 68008 on start-up. Then it can take the 2nd CPU out of reset, which requests and is granted the bus. That lets me check and work on equations to optimize timing of the RAM subsystem for every CPU.
One quirk of the RAM is that it has 512x16-bit pages. Accesses within pages take 45-47 ns. Crossing a page boundary, the next access takes an additional 24 ns. This makes no difference for any 68SEC000 up to 86 MHz. However, on the 68020 and 030 it can start providing improvements from 40 or 45 MHz on.
The logic checks A9..21 against a stored copy of A9..21. If they match it asserts /DTACK immediately. If the number changes (a page boundary has been crossed) it stores that number and adds one to the /DTACK countdown counter, which exists because I need a cycle and /DTACK counter for the finite state machine that handles byte swapping with the BBQL. That logic also allows the CPU to continue, unless it tries to access the QL again before the state machine has concluded its cycles. At that point it just holds /DTACK from the 68SEC000, which isn't connected to the QL until the FSM finishes there.
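To make the page-compare idea above a bit more concrete, here is a small C model of it. It is a software sketch only, not the CPLD equations: it keeps a stored copy of A9..21, reports 0 extra /DTACK counts on a match and 1 on a page change. The bit range is taken straight from the description above; the names page_tracker and page_extra_wait, and the valid flag that treats the very first access as a miss, are assumptions added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 9u        /* lowest compared address bit: A9 */
#define PAGE_MASK  0x1FFFu   /* 13 bits, A9..A21 */

/* Hypothetical model of the stored address copy. */
struct page_tracker {
    uint32_t stored_page;    /* last seen value of A9..A21 */
    int      valid;          /* assumption: nothing is stored before the first access */
};

/* Returns the extra /DTACK countdown steps for this access:
 * 0 if A9..A21 match the stored copy, 1 if a page boundary was crossed. */
unsigned page_extra_wait(struct page_tracker *t, uint32_t addr)
{
    assert(t != NULL);
    uint32_t page = (addr >> PAGE_SHIFT) & PAGE_MASK;
    if (t->valid && page == t->stored_page)
        return 0;            /* same page: /DTACK can go out immediately */
    t->stored_page = page;   /* page changed: remember the new one */
    t->valid = 1;
    return 1;                /* and add one to the countdown */
}
```

Feeding it a run of sequential addresses shows a single extra count at each page change and none in between, which is the behaviour the real logic is after.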
This QL-facing logic allows me to run the CPU at 7.5 MHz during the QL cycles only, but will allow 30 MHz cycles on the QL bus. Now, I can feel Nasta's back tensing up.

I have used resistor-capacitor-inductor filters to slow the transitions to a QL-friendly slew rate. In practice, the CPU latches the data in a write cycle and goes about its business. Read is a work in progress. I'm delaying /DTACK but currently the counter is clocked at the CPU's running speed instead of the /4 speed. I plan to make that a jumpered setting.
As another random aside, a subset of this board could be used as an upgrade to Gold Cards: 8MB or 15.5MB of RAM and dual speed, 32 MHz and 16 MHz. I haven't tried it, but removing the DRAM and preventing the refresh signal from reaching the CPLD should be the only changes needed. Maybe. Either way, the DRAM can be shadowed, so no matter.