
Thursday, 21 October 2021

 Code Comprehension

Surely it's better to write this:

      romdata[addr-0x8000] = v1;
      romdata[(addr-0x8000)+1] = v2;
 

like this:

      romdata[(addr-0x8000)+0] = v1;
      romdata[(addr-0x8000)+1] = v2;
 

Ok, it's a bit more typing but it's easier to understand, surely?


Thursday, 14 October 2021

Code the Specification: 6303 Assembler in 2000 Lines of Tcl Using Regular Expressions


 

I have been interfacing to the Psion Organiser 2 recently, firstly using a Raspberry Pi RP2040 to replace a datapack and provide SD card storage of pack images:

and also in the top slot of the organiser, which provides a way to add hardware to the organiser:


When adding hardware in this way the organiser allows the code that drives the hardware to be held on the hardware as an embedded datapack. This means that you never lose the drivers for a piece of hardware as they are built into the device itself. I made a prototyping board using a Raspberry Pi Pico running modified datapack code from the RP2040 project. The drivers for the hardware are usually written in assembly code as direct access to the hardware is required.

Datapacks that drive hardware usually add commands to the main menu of the organiser, and I wanted to do that for my prototyping hardware. To do this requires building a datapack image containing the command code plus the extra information that datapacks contain that enables the organiser to load the code.

Psion supplied an assembler in the 80s that was able to perform the assembly of code together with the extra features that provide the datapack image. This is still available but only runs under DOS, so must be run in some kind of emulator, or on a native DOS machine. I don't have a DOS machine and although I do have a DOS emulator and have run this assembler, I really wanted a version of the assembler that runs on Linux. There are a few 6303 assemblers about, but I couldn't find one that did all of the special organiser things, so I decided to write one.

An assembler is quite simple in that it takes assembly language instructions from a text file and builds binary object code. The datasheet for a processor usually provides a list of the instructions and the corresponding object code, so lends itself well to a method I call 'Code the Spec'.

The idea of coding the specification is to take some core information directly from the specification, copy it (cut and paste ideally)  and build a data structure that the code uses to do whatever it needs to do. In this case the assembly language instructions are copied from the processor datasheet and form an array of data that the code uses to parse input files.

The 6303 datasheet I found has a list like this:

 The assembly instructions are in the second column and the machine code is in a column per addressing mode. Addressing modes pose an extra problem as you have to recognise which addressing mode the instruction is using in order to determine the machine code that must be generated. I decided to use regular expressions to parse the assembly instructions as they are compact and powerful and can handle this parsing. The corresponding table in the code for the above instructions is here:


    {ADCA ____ 89   99   A9   B9   ____ ____ ____}
    {ADCB ____ C9   D9   E9   F9   ____ ____ ____}
    {ADDA ____ 8B   9B   AB   BB   ____ ____ ____}
    {ADDB ____ CB   DB   EB   FB   ____ ____ ____}
    {ADDD ____ C3.3 D3   E3   F3   ____ ____ ____}
    {AIM  ____ ____ ____ ____ ____ ____ 71   61  }

Well, most of the instructions are there. As the datasheet is a scan I couldn't cut and paste from it, so I used an alphabetical list of 6303 instructions to create the array. Taking the ADCB instruction as an example:

    {ADCB ____ C9   D9   E9   F9   ____ ____ ____}

the instruction to be matched is shown as ADCB and the opcode for each addressing mode is in a column in the same order as the datasheet. You can see that a cut and paste, if you could do it, would quickly create this table. The instruction length is set as a default for each addressing mode, as the addressing mode usually fixes the length. There are always some exceptions, and the ADDD instruction has an instruction length override, using C3.3 to specify a length of 3 bytes for this instruction.

The datasheet only shows the addressing modes for the instructions it is describing, so the array has more columns than the datasheet as it has all possible addressing modes.
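As a rough illustration of how a table row like the ones above can be turned into data, here is a minimal Python sketch (the assembler itself is Tcl). The column order and default lengths are assumed to follow the ADDMODE list shown in this post; `parse_row`, `MODES` and `DEFAULT_LEN` are names invented for this example.

```python
# Sketch only: parse one 'Code the Spec' table row into per-mode opcodes.
# Assumption: the eight columns follow the order of the ADDMODE list.
MODES = ["REL", "IMM", "DIR", "IDX", "EXT", "IMP", "XIM", "XXM"]
DEFAULT_LEN = {"REL": 2, "IMM": 2, "DIR": 2, "IDX": 2,
               "EXT": 3, "IMP": 1, "XIM": 3, "XXM": 3}

def parse_row(row):
    """Return (mnemonic, {mode: (opcode, length)}) for one table row."""
    fields = row.strip("{} \n").split()
    mnemonic, cells = fields[0], fields[1:]
    entry = {}
    for mode, cell in zip(MODES, cells):
        if set(cell) == {"_"}:            # '____' marks a mode that is not available
            continue
        opcode, _, length = cell.partition(".")   # 'C3.3' -> opcode C3, length 3
        entry[mode] = (int(opcode, 16),
                       int(length) if length else DEFAULT_LEN[mode])
    return mnemonic, entry
```

Parsing the ADDD row this way gives an immediate-mode entry of opcode 0xC3 with the overridden length of 3, while the other modes pick up their default lengths.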

Each addressing mode modifies the syntax of the instructions, so a list of regular expressions is used to both determine the addressing mode being used, and pull out the operands for each instruction.

These variables help with expression complexity:

set ::RE_EXPR "\[A-Z0-9a-z_$^\]+"
set ::RE_EXPR ".+"
set ::IMM_RE "^%s\[ \t\]+#\[ \t\]*(\[A-Z0-9a-z_$^\]+)"


set ::ADDMODE {
{"REL" 2 "^%s\[\t\]+($::RE_EXPR)"}

{"IMM" 2 "^%s\[ \t\]+#\[ \t\]*($::RE_EXPR)"}

{"DIR" 2 "^%s\[ \t\]+($::RE_EXPR)"}

{"IDX" 2 "^%s\[ \t\]+($::RE_EXPR)\[ \t\]*,\[ \t\]*(\[Xx\])"}

{"EXT" 3 "^%s\[ \t\]+($::RE_EXPR)"}

{"IMP" 1 "^%s\[ \t\]*$"}

{"XIM" 3 "^%s\[ \t\]+#\[ \t\]*($::RE_EXPR)\[ \t\]*,\[ \t\]*($::RE_EXPR)" }
{"XXM" 3 "^%s\[ \t\]+#\[ \t\]*($::RE_EXPR)\[ \t\]*,\[ \t\]*($::RE_EXPR)\[ \t\]*,\[ \t\]*\[Xx\]"}
}

Each addressing mode has the name of the mode, the default instruction length for instructions of that addressing mode and the regular expression that can be used to parse the data and also to decide if an instruction is using a particular addressing mode.

The assembler can test each assembly instruction against a regular expression made up of the instruction name and then each addressing mode in turn. When it gets a match it can build the object code from the table entries.
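The matching step can be sketched in Python as follows. This is not the Tcl code: the operand expression is the narrower first `::RE_EXPR` definition, the patterns are anchored at the end of the line, and the modes are tried most specific first, all of which are choices made for this sketch. `classify`, `OPCODES` and `WS` are hypothetical names, and only three of the modes are shown.

```python
import re

WS = r"[ \t]"
EXPR = r"[A-Za-z0-9_$^]+"   # operand, as in the first ::RE_EXPR definition

# (mode, default length, regex template); %s is replaced by the mnemonic.
ADDMODE = [
    ("IMM", 2, "^%s" + WS + "+#" + WS + "*(" + EXPR + ")" + WS + "*$"),
    ("IDX", 2, "^%s" + WS + "+(" + EXPR + ")" + WS + "*," + WS + "*([Xx])" + WS + "*$"),
    ("DIR", 2, "^%s" + WS + "+(" + EXPR + ")" + WS + "*$"),
]

# Opcodes for one instruction, copied from the ADCB row of the table.
OPCODES = {"ADCB": {"IMM": 0xC9, "DIR": 0xD9, "IDX": 0xE9, "EXT": 0xF9}}

def classify(line):
    """Return (mode, opcode, length, operands) for a source line, or None."""
    line = line.strip()
    if not line:
        return None
    mnemonic = line.split()[0].upper()
    for mode, length, template in ADDMODE:
        if mode not in OPCODES.get(mnemonic, {}):
            continue                      # instruction has no opcode for this mode
        m = re.match(template % re.escape(mnemonic), line, re.IGNORECASE)
        if m:
            return mode, OPCODES[mnemonic][mode], length, m.groups()
    return None
```

The first addressing mode whose pattern matches wins, and the captured groups are the operand expressions that still need to be evaluated.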

The assembler is a multi-pass assembler, as the addressing mode of an instruction can depend on symbol values that are not known until a later pass.

There are some complications, such as macros and code overlays. There are also various directives to define byte, word and string data, and the concept of 'pack address' which is the address that a byte is located at in the 'EPROM' of a datapack, which is different to the address defined using ORG directives.  The Psion organiser also requires relocatable code which is done using a 'fixup' list at the end of the object code.

The assembler supports all of these features in just under 2000 lines, and manages to assemble Psion example code with almost exactly the same object code. I say almost exactly as I found that the Psion assembler example I was using used a less efficient addressing mode for one of the instructions, but only for one occurrence of it. My assembler generated a more efficient equivalent instruction and so was one byte shorter than the Psion version.

 The Psion assembler list output (this is from the XDICT.SRC example):

 165   20B4 97 E2                           sta    a,flag:

 372   21F2 B7 00E2                         sta    a,flag 

My assembler outputs the first form once it knows that the data is zero-page. The colon at the end of the label may be the reason but I haven't found documentation of that syntax.
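The two encodings in the listing differ only in addressing mode, and the choice can be sketched like this in Python. The opcodes 0x97 and 0xB7 are taken from the listing lines above; `encode_sta_a` is a hypothetical helper, not a function from the assembler.

```python
def encode_sta_a(address):
    """Encode 'sta a' to an address, using the shorter form when possible."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address out of 16-bit range")
    if address <= 0xFF:
        # Zero-page address: two byte direct form, as on listing line 165.
        return bytes([0x97, address])
    # Otherwise the three byte extended form, as on listing line 372.
    return bytes([0xB7, address >> 8, address & 0xFF])
```

The choice can only be made once the value of the label is known, which is why a symbol defined later in the file forces the assembler to revisit the instruction.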

The assembler can also embed the object code in a C program as an array using embedded comments to mark where the data should appear. This is useful as the code needs to be in a C file where it is compiled and flashed to a RP Pico.

This example of 'Code the Spec' didn't use exactly the same syntax as the datasheet. For another example, have a look at the Z80 Arduino Shield:

 
 
https://trochilidae.blogspot.com/2019/12/z80-arduino-using-mega-as-debugger-ever.html
 
The code is on GitHub:
 
https://github.com/blackjetrock/z80_shield/blob/master/software/disasm/disasm.c  
 
and the disassembler uses a table of this form:
 
"00rrr110 nn :LD r, n:",
"01rrr110 :LD r, (HL):",
"11y11101 01rrr110 dd :LD r, (y+d):",
"01110rrr :LD (HL), r:",
"11y11101 01110rrr dd :LD (y+d), r:",
"01rrrsss :LD r,s:",

 If you look in the Z80 programming databook you will see that the instructions are defined in exactly this form, and the table was copied from those definitions.
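To show how such a table can drive a disassembler, here is a minimal Python sketch that handles single-byte entries only. It is not the code from disasm.c: `match_pattern`, `decode` and `REG` are names invented here, and the register numbering is the standard Z80 encoding (code 6 means (HL) and is not a plain register).

```python
# Standard Z80 register codes; 6 is (HL) and is deliberately left out.
REG = {0: "B", 1: "C", 2: "D", 3: "E", 4: "H", 5: "L", 7: "A"}

TABLE = [
    "01rrr110 :LD r, (HL):",
    "01rrrsss :LD r,s:",
]

def match_pattern(pattern, byte):
    """Match an 8 character bit pattern; return the field values or None."""
    fields = {}
    for i, ch in enumerate(pattern):
        bit = (byte >> (7 - i)) & 1
        if ch in "01":
            if bit != int(ch):
                return None               # fixed bit does not match
        else:
            fields[ch] = (fields.get(ch, 0) << 1) | bit
    return fields

def decode(byte):
    """Return the text of the first table entry that matches, or None."""
    for entry in TABLE:
        pattern, text = entry.split(":")[:2]
        fields = match_pattern(pattern.strip(), byte)
        if fields is None or any(v not in REG for v in fields.values()):
            continue
        # Replace each field letter in the text with its register name.
        return "".join(REG[fields[c]] if c in fields else c for c in text)
    return None
```

As in the table above, the order of entries matters: the more specific (HL) form has to come before the general register to register form.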

 

Sunday, 15 August 2021





How fast is a Modern Microcontroller?


At the time of writing, the Raspberry Pi Pico has just been released, so can be considered modern. How fast is it? Compared to the object of a recent project, the Psion Organiser II from the 80s, it is pretty fast. Fast enough, in fact, to be able to pretend to be an EPROM in the Psion datapak.

Some Technical Psion History

The Psion Organiser II has two storage slots that were designed to hold data in a similar manner to hard drives on a PC. The sizes range from 16K up to 1Mb, although around 32K was a more usual size. The hardware in a datapak was simple: it was an EPROM. Later paks held RAM (and a battery) or flash for the very latest devices. Using more modern technology had a big advantage in that a UV eraser wasn't needed to clear the data on the pak. EPROMs had to be erased before allocated space could be re-used.

The storage devices in a pak are directly connected to a data bus on the slot connector, but the address bus was different. Due to a desire to reduce the number of connections required, the address bus on the storage device is attached to the outputs of a counter. To set up an address the clock on the counter is pulsed until the address is correct. Control signals then orchestrate the reading or writing of data.

Using this scheme the number of connections on a slot connector is kept to 16, even though datapaks up to 1Mb can be used. Larger address ranges would require a large number of clock pulses, slowing down the data rates, so larger datapaks use extra counters as page and/or segment counters. These control higher address lines and reduce the number of clock pulses needed to move to different addresses.
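A small Python model makes the cost of counter addressing easier to see. It is only a sketch: the class and function names are invented, signal names and edge behaviour are ignored, and the 256 byte page size is an assumption used for illustration.

```python
class CounterAddressedPack:
    """Toy model of a pak whose address lines come from a counter."""

    def __init__(self, image):
        self.image = bytes(image)
        self.address = 0

    def reset(self):
        self.address = 0                  # counter cleared

    def clock(self):
        self.address = (self.address + 1) % len(self.image)

    def read(self):
        return self.image[self.address]   # data bus shows the addressed byte

def pulses_linear(address):
    """Clock pulses needed to reach `address` from reset with one counter."""
    return address

def pulses_paged(address, page_size=256):
    """Pulses needed when a separate page counter drives the upper address lines."""
    page, offset = divmod(address, page_size)
    return page + offset                  # one pulse per page, then one per byte
```

With these assumptions, reaching the last byte of a 32K pak takes 32767 pulses with a single counter but only 382 with a page counter.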

The Psion technical manual is available on the web and details all of the signals.

Pico

With the RP Pico arriving, I was very interested to see if I could use the programmable IO (PIO) feature to interface to older hardware. Creating a datapak for the Organiser II seemed to be one of those projects. The PIOs are small, fast processors or state machines that can perform tasks that are closely coupled to the GPIO lines on the Pico. As EPROM (and RAM and flash) devices are usually found attached to processor buses they are inherently fast devices. The datapak doesn't run at high data rates, however, as the processor in the Organiser, a 6303, only runs at 900kHz. The interface code that drives the datapak slot signals drives them with no delays so, for example, the assertion of the slot select signal to read data is just three instructions:

Assert select

Read Data

De-assert select

With 3 or 4 cycles to execute these instructions we end up with a select pulse a few microseconds wide, equivalent to a signal rate of about 200kHz or so.
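The arithmetic behind that estimate can be written out as a short Python sketch. It assumes one 6303 cycle per period of the 900kHz clock, and takes the "over 100MHz" figure for the RP2040 quoted in this post as a lower bound; the function names are made up for the example.

```python
PSION_CLOCK_HZ = 900_000        # 6303 clock rate quoted in this post
PICO_CLOCK_HZ = 100_000_000     # "over 100MHz", used here as a lower bound

def pulse_width_us(cycles, clock_hz=PSION_CLOCK_HZ):
    """Width in microseconds of a pulse lasting `cycles` clock periods."""
    return cycles / clock_hz * 1e6

def pico_cycles_available(cycles):
    """RP2040 clock cycles that fit inside a Psion pulse of `cycles` periods."""
    return pulse_width_us(cycles) * PICO_CLOCK_HZ / 1e6
```

Under these assumptions a 3 cycle pulse lasts a little over 3 microseconds, which leaves a few hundred RP2040 clock cycles to respond.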

This is well within the capabilities of the PIOs in the Pico, so I decided to go ahead and build a Pico powered datapak breakout board. This plugs in to one of the slots in the Psion and has an OLED display and some switches as well as level shifters (the Organiser is 5V, the Pico is 3V3). The idea is to present the RAM (or some of it) within the Pico as a datapak plugged in to the slot.

There's no commitment concerning GPIO assignment and the PIOs when creating a circuit with the Pico as the PIOs can use any GPIO, and can be disconnected from them entirely if code is to control them. So, I built a circuit before I had prototyped the method it would work under.


As it happened I had to make three boards as I messed up the slot connections on the first board (so it would only work upside down, not useful), and on the second board the level shifters I chose (YE08s) just didn't work. The third board using 74LVC245s worked perfectly. Well, the hardware did. As I looked at the PIO program that would be needed I started to realise that maybe the PIOs couldn't handle this interface. The problem was twofold:

1. The address counters were hard to implement as only one register (X) in the PIO could be changed by one in a PIO program. And it could only be decremented. Only decrementing wasn't too much of a problem, but there were also up to three counters in a datapak. Synchronising the address lines could be tricky.

2. The second problem was more of a show stopper. The address counters have to be combined and then be used to address the RAM buffer in order to get the data to be read or written. I couldn't see how to do this, which put a stop to me using the PIOs.

Speed

Coming to the rescue, however, is the sheer speed of the processors in the RP2040 (the Pico processor). There are two cores running at over 100MHz and this is enough processing power to handle the datapak interface in firmware. I had to run the address counter handling on one core and the control signal handling on the other. I also found that interrupt latency was too high and had to poll the GPIOs in a loop. This means that the code has to run in two modes, one that is handling the pak protocol, and one that is driving a UI. This isn't much of a limitation as the Psion only talks to the datapak when reading or writing and actually powers the paks down when not using them. (This power down behaviour means that I have to power the Pico pak with a USB cable otherwise it is turned off between accesses).

After quite a lot of coding and interface investigation I managed to get this working and the code is capable of emulating a 32K datapak. With 235K of RAM on the RP2040 it should be possible to emulate 64K, and maybe 128K paks.

Space

The best form factor for a datapak emulator is, well, that of a datapak. So can the RP2040 fit in that footprint? Well, yes it can. The level shifters fit as well, as does a small OLED display and some buttons. In fact, the breakout board circuit can be shrunk down to fit in the datapak enclosure (holes for buttons and the display are needed, and for the USB connector).


For bulk storage I added an SD card slot so datapak images (in .opk format) can be read and written to and from the RAM buffer in the RP2040. That provides enough storage to easily store every datapak image I can find. Any of these can be swapped in to the RAM buffer and used as required. You can also write data to the RAM buffer and then write that to the SD card, providing almost unlimited memory to the organiser.


The wires are for programming the RP2040, they solder to pads on the PCB. The display is mounted 180 degrees from where I want it due to a PCB layout problem. The idea is to have the entire unit fit in the space of one datapak, which it should do when V2.0 fixes the issues on this PCB. 

The package used by the RP2040 turned out not to be a massive issue when hand soldering. Using some extra solder and a hot air gun I was able to solder two devices with no problems; a third had some unknown issues, which were solved by removing it and replacing it with a new device.

The breakout board:


The datapak sized board:


The datapak form factor board, and the gadget plugged in to a model LZ organiser.




Saturday, 31 July 2021

 Transputer: Stack Based With OS in Hardware

Picoputer: RP Pico Hardware Transputer

 

The transputer was and still is an odd beast. It has hardware support for processes (hence OS in hardware, well, sort of), and its assembly language is such a pain that Occam is a much better way to program it. Occam is a language that is close to the hardware and allows parallel processing to be built in at a basic level. Transputers have four fast (for the time) 10Mbps (20Mbps later) links that are used to communicate between devices and other systems. As the processors improved in capabilities and speed, the links remained compatible.

The later transputers have floating point in hardware, which makes them useful for computationally intense work, especially when configured in networks.

The only real downside is that the chips were expensive, so they never really made it into common usage in embedded applications. Quite a few parallel processing systems using transputers were made, though.

I read all about transputers when they came out but never had a chance to use any. Recently, though, I was looking at vintage processors and transputers came up, which triggered some memories. Unfortunately, the downside of high cost seems to still exist, and vintage transputers are quite expensive. Not as expensive as when they were new, but costly enough that I didn't just buy a few. 

If you want to run some meaningful code on a transputer then you also need to add some RAM, as the chip itself only comes with about 2K onboard. A common 'unit of computing' using a transputer is the TRAM (TRAnsputer Module) which is a transputer plus some RAM. These are very expensive to buy, to the extent that creating a system with more than one processor in it is just not economically sensible.

Raspberry Pi Pico

At around the same time, the Raspberry Pi Pico came to my attention. This is a modern micro-controller board that uses the RP2040 device, which is interesting for me as it has eight intelligent hardware GPIO processors (the PIO state machines). I think that these are very useful for interfacing to old hardware buses, such as the FX702P display bus I sniffed with a Blue Pill, or the FX502P external interface bus. When I implemented these projects I used firmware to interface to the bus, which was just about possible using the Blue Pill as it has a high clock rate relative to the bus. Interrupts were necessary in the case of the FX502P bus. The RP2040, though, has programmable hardware that can operate at frequencies of tens to a few hundreds of MHz. This, hopefully, should make it possible to interface to some devices that have higher clock rates.


 

While I was looking at the RP2040, it suddenly occurred to me that the four links on a transputer could be implemented using the eight PIO state machines on an RP2040. Each state machine handles data in one direction, leaving the processor(s) free for other work. What other work? Well, how about running an emulator of a transputer on the core? That would give you a hardware emulation of a transputer. How fast would it be? Well, the original transputers were running at about 20MHz, and the Pico runs at 135MHz. So it probably wouldn't run at the same speed as an original, but it would only be about an order of magnitude slower, maybe. And you can, of course, just add more transputers (real or emulated) to speed things up...

The links that the Pico provides can easily run at the standard 10MHz link speed (10Mbps) and running at the faster 20MHz shouldn't be a problem either. In fact, if only emulated transputers are talking then a faster link rate could maybe be used.

Host Communication

The transputer links can't be attached to a modern PC, but INMOS made some link adapter ICs (the IMSC011). These are fairly easy to buy, and provide two 8 bit data buses, one for the LinkIn direction and one for LinkOut. Adding one of these to an Arduino would give a way to interface a transputer to a PC.

 

As these are devices that run off 5V I decided to use an Arduino Mega Embedded, partly because I had one. The parallel buses can be wired up to the Mega, together with the Valid and Ack signals. These are used to signal that the data is valid (when Valid is active) and also allow data to be acknowledged (by Ack). The Arduino can then do whatever is needed with the data. I decided to send the data over USB to a host PC as that is the arrangement that the transputer originally used. The host PC then runs a server that handles the 'SP Protocol' which allows input and output on a terminal and keyboard and also allows access to files in the file system.

Booting

A transputer can be booted either from ROM or from a link. I didn't want to boot from ROM, although a program can easily be stored in flash and executed at startup. It's more flexible to boot from a link as the host can then supply the code which can be compiled Occam, or C, or any of the other languages that can generate transputer object code. I'm particularly interested in Occam.

Booting from a link is built in to hardware and involves sending a small (up to 255 bytes) bootstrap loader. This then executes and loads further data (the boot loading phase). That boot-loader then loads more chunks of code over the link.

For the host, this is all rather simple: all it does is send the boot file to the transputer link. The format of the data is set up to drive the three stage boot process.
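The first stage can be sketched in Python as below. This rests on an assumption drawn from my reading of the INMOS boot-from-link description rather than from the post: the first byte sent on the link is taken as the length of the bootstrap that follows, with the values 0 and 1 reserved for other uses. `split_boot_stream` is a hypothetical helper for inspecting a boot file, not part of the host code.

```python
def split_boot_stream(data):
    """Split a boot stream into (first stage bootstrap, everything after it)."""
    if not data:
        raise ValueError("empty boot stream")
    length = data[0]                      # assumed: first byte is the bootstrap length
    if length < 2:
        raise ValueError("first byte 0 or 1 is not a bootstrap length")
    if len(data) < 1 + length:
        raise ValueError("boot stream shorter than its bootstrap length")
    bootstrap = data[1:1 + length]        # at most 255 bytes, run by the hardware
    remainder = data[1 + length:]         # consumed by the later boot stages
    return bootstrap, remainder
```

The host itself does not need to do this split; it only streams the bytes, and the transputer and the loaders interpret them.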


Using PIOs As Transputer Links

The transputer links use a protocol that is very similar to asynchronous serial data. You can view data packets as having a start bit, a type bit and eight data bits followed by a stop bit. An ACK packet follows much the same format, except the data bits are missing. The type bit is 1 in a data packet and 0 in an ACK packet. I started with the serial UART PIO code in the Pico examples and adjusted it to use the transputer protocol. I have a bit of work to do concerning the ACK packet, as I treat the ACK packet as a 10 bit frame at the moment, just with trailing zeros. This could possibly lead to problems if serial data is sent within 7 bit times of an ACK packet, but is working OK for now.
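The framing described here can be sketched as bit lists in Python. The start bit value of 1, the stop bit value of 0 and the least significant bit first ordering are assumptions based on my understanding of the INMOS link format, and the function names are invented; this is not the PIO program.

```python
def data_frame(byte):
    """Bits of a data packet: start, type=1, eight data bits, stop."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("data must fit in one byte")
    bits = [1, 1]                                   # start bit, then type bit (1 = data)
    bits += [(byte >> i) & 1 for i in range(8)]     # assumed: least significant bit first
    bits.append(0)                                  # stop bit
    return bits

def ack_frame():
    """Bits of an ACK packet: start bit, then type bit 0, with no data bits."""
    return [1, 0]
```

An ACK is much shorter than a data packet, which is why treating it as a full length frame padded with zeros can clash with data that follows closely behind it.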

I used one PIO for LinkOut and one for LinkIn, and for the prototype I generate a 5MHz clock with a PIO for the IMSC011 ClockIn pin. 


 Once fired up and wired up this PIO code was capable of driving the IMSC011 and successfully sending and receiving data.

Host Code

The host code will eventually run an SP protocol which will give the full range of IO and file access. For a first pass, just a simple display of data coming in to the host was implemented, as a test of the links. The Arduino Mega sends the link data over a simple (and inefficient) protocol over USB to the host. Using this setup I was able to run a simple hello world program I found on the internet and have the text appear on the host once the Picoputer was booted.


I toyed with the idea of compiling the INMOS server tools

Real Hardware

Just for interest, I also ran this prototype set up using a real transputer that I managed to source. It's actually a motherboard for a larger system, but it has a transputer on it. I applied power and then reverse engineered some signals (most importantly the BootFromROM signal had to be de-asserted). Once this was done, the host code booted the hello world program which then ran and resulted in the 'Hello World' display.



Running Occam on the Pico

While running precompiled binaries is fine for a test, what I'd like to do is run Occam on the Pico. This turns out to be tricky as I can't find a compiler that runs on Linux and, most importantly, generates transputer object code. There's the KROC and the SPOC compilers, but they generate machine code for other processors. About the only option seems to be the original INMOS compilers. They, however, run on older operating systems, DOS being the one for the PC hardware platform. 

I have found a useful VM image on the geekdot website which allows me to run one of the INMOS compilers, so I can actually compile Occam, and then link and collect it down to transputer machine code. At the moment I'm copying the files on and off the VM using a virtual floppy disk image file. This is not hugely convenient; hopefully I will be able to compile on a transputer, or maybe recompile the compiler for Linux.

I found a simple Occam program, which looks like this:

#INCLUDE "hostio.inc"  -- contains SP protocol
PROC simple (CHAN OF SP fs, ts)
  #USE "hostio.lib"
  [1000]BYTE buffer :
  BYTE result:
  INT length:
  SEQ
    so.write.string    (fs, ts,
                            "Please type your name :")
    so.read.echo.line  (fs, ts, length, buffer, result)
    so.write.nl        (fs, ts)
    so.write.string    (fs, ts, "Hello ")
    so.write.string.nl (fs, ts,
                             [buffer FROM 0 FOR length])
    so.exit            (fs, ts, sps.success)
:


This uses the 'SP Protocol' to perform input and output using the host system. After some fiddling (porting a third emulator and writing some SP protocol functions and various bug fixes), the host system displays this:

 

Port name:/dev/ttyUSB0
Bootfile:SIMPLE.BTL
Serial port OK
Sending boot file
Boot file sent
Please type your name :AndrewwHello Andrew

The key line here is the last one. The Occam program prompted for my name as it should, then I typed my name in and it displayed the result. OK, the newline is a 'w' and it took a few seconds to run, but it ran. The '.btl' file (BooT Link, or object file) for this program is 3935 bytes long, so a sizeable chunk of code that was loaded using the three stage bootloader mechanism. No bad opcodes, either.

The whole arrangement is not optimised for speed at all; it is optimised to get it working, so running this program does take a while. With some changes I should be able to get a binary transfer of code running, and hopefully that will be faster. The emulator could be sped up a little, perhaps, but when single stepping it doesn't seem to be particularly inefficient.

The RP2040 is dual core, which means that I could run an emulator per core and have two transputers on the one board. Due to some excellent design, the transputer link architecture makes no distinction between hardware links and internal communication links: the code has no idea what it is dealing with. This should make it easy to set up communication between the two cores over links.

But, it works!

 



Friday, 16 July 2021

 Reverse Engineering an LCD Display

I have a DAB radio which started to irritate me in various ways. Finally fed up with it, I bought another radio and decided to free up the space that the old radio takes up by tearing it down.

This is the DAB radio taken apart:


 

Nice display: 16 character, 2 line dot matrix.


Looking at the PCB, it seems to have a three wire interface:


Scoping up the lines, one looks like a reset:

Others have data and clock:


Looking at the data, there's a different voltage level in there. I2C has an ACK which comes from a different device to the one generating the data, so a different level makes sense.

As this is probably I2C, switching on the decoder in the scope shows what is being sent:


Now we know the I2C slave address and some data. It does look like I2C. What device is it though? Searching the internet gives a few options. It could be a standard I2C GPIO expander attached to an LCD controller.

After power up there's an initialisation sequence:

That's the first byte in a longer sequence that goes:

0x39, 0x14, 0x7f, 0x57, 0x6b, 0x0c, 0x01, 0x06, 0x38, 0x40.

There also seem to be two bytes for every byte of data transmitted: the initialisation bytes have a leading byte of 0x00, the data bytes have a leading byte of 0x40.

Using this sequence we could maybe find the controller that is used. After more searching there's a possible match. The Winstar WO1602I-TFH-AT module has an initialisation sequence that is similar. The datasheet shows the initialisation sequence to be:

0x38, 0x39, 0x14, 0x74, 0x54, 0x6f, 0x0c, 0x01

This is similar enough to probably be the same controller with a slightly different LCD attached. The datasheet is useful. It says the controller is an ST7032, and the first byte of the two that are sent is a control byte. It holds Co and RS. RS is the command/data bit, while Co is a continuation bit that allows more than one control byte to be sent. The datasheet shows the RS bit as bit 6 which matches the trace on the scope. It looks like this is indeed the controller in the display.
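The two byte framing seen on the scope can be sketched in Python like this. The bit positions (Co as bit 7, RS as bit 6) follow the description above, and the initialisation bytes are the ones captured from the radio; the function names are invented, and the I2C slave address and the actual bus write are left out.

```python
CO_BIT = 0x80   # continuation bit: another control byte follows
RS_BIT = 0x40   # register select: 0 = command, 1 = display data

# Initialisation bytes as captured from the radio at power up.
INIT_SEQUENCE = [0x39, 0x14, 0x7F, 0x57, 0x6B, 0x0C, 0x01, 0x06, 0x38, 0x40]

def control_byte(rs, co=0):
    """Build the control byte that precedes each command or data byte."""
    return (CO_BIT if co else 0) | (RS_BIT if rs else 0)

def command(value):
    """Two byte I2C payload for one controller command (leading byte 0x00)."""
    return bytes([control_byte(rs=0), value])

def data(value):
    """Two byte I2C payload for one display data byte (leading byte 0x40)."""
    return bytes([control_byte(rs=1), value])

def init_payloads():
    """Payloads for the captured initialisation sequence, in order."""
    return [command(b) for b in INIT_SEQUENCE]
```

Each payload would be written to the display's slave address as a separate I2C transfer, matching the pairs of bytes seen on the trace.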

Excellent, we have enough data to probably drive this display.