MSX Computer & Club Webmagazine (MCCW), issue 91, January/February 2000
part 1/23
23 steps to high resolution on MSX1

Someone once told me that watching 8-bit demos is like watching a magician’s show: you know he’s fooling the eye all along, but you never know how. But the similarities between magicians and coders go beyond that. Just like magicians, coders seem to think that if they reveal how their effects were coded, it’s not magic anymore. A real coder will lie, cheat and deceive you, all in the name of fame for the impossible stunts he’s pulled off. A coder who falls short of this just couldn’t be a real coder at all, let alone code for a real demogroup. But I’ve never really felt like that anyway.

Antti Silvast

1. http://www.komkon.org/fms...

Small dictionary
retrace: The period during which the electron beam is drawing (refreshing) the screen. Occurs at a rate of 50 Hz / 60 Hz.
vblank: The period during which the electron beam moves from the bottom of the screen back to the top. Generates an interrupt. Occurs at a rate of 50 Hz / 60 Hz.
ram: Normal memory that you can access with the ordinary Z80 memory instructions.
vram: Memory dedicated to the video processor.
vdp: Video display processor.
sprite: A hardware-generated image that is drawn on top of your pattern-based graphics, with one transparent colour.
bob: A software sprite.
NOP: No OPeration: a Z80 instruction that does absolutely nothing except take time.
raster split: When the screen is output over several frames you get a raster split: one part of the screen is in a different phase than the other.
double buffering: A method where you show one screen buffer, draw to another hidden buffer, and then swap the two.

The first thing you must realize about coding 8-bit demos is that the math involved isn’t really that complicated. Still checking out those 1240 texturemapped polygons on screen and wondering how on earth those 8-bit wizards managed to optimize it all to run at 50 Hz, in realtime, on a 1 MHz machine? Well, I’m sorry if I’m the first one to tell you this, but they did not. Nothing at all in 8-bit demos is either very realtime or very optimized. Or maybe it’s realtime enough to have a realtime screen dumper, but that’s pushing it. If any optimization at all was performed, it was the precalc that was optimized in size and speed, not the actual code. That’s why I won’t go into any depth on mathematics in these articles. I’m not going to tell you how to write your own 3D engine on the MSX, because I’ve never written one myself. You must realize it doesn’t really matter how fast the precalc algorithms are, since all they really do is precalc. Of course, the faster the routine, the less time you’ll spend waiting.

     So what’s the catch? Why doesn’t everyone start coding their own animation players for the 8-bits, pretending the stuff is very realtime and making people believe they’re actually very skilled coders? Well, first of all, nobody has yet ported Java(tm) to the MSX, and most of the wannabe industrial programmers are just too lazy to learn any language that doesn’t carry the stamp “commercially profitable”. Now who on earth would hire an expert in hacking an over 15-year-old computer with an 8-bit processor already long out of production? Second, it’s not all that easy. You see, the real problem is not coming up with the actual precalced effect, but outputting the data to the screen. To a 32-bit programmer this will seem simply ridiculous. Picture a scene where you could perfectly render a realistic Doom to your 64×128 offscreen buffer at the rate of 20 Hz, but simply couldn’t draw the darn thing to the screen fast enough. Well, that’s almost exactly the case with 8-bits. So when the closing 3D scene from the C64 version of 2nd Reality starts rolling, both the superhuman pc-coder and I will be gasping, but for different reasons: he’ll be going ‘how do they draw all those 1024 polygons in realtime?’ whilst I’m thinking ‘To hell with the polygons, that has got to be an animation. But how on earth do they output it that fast?’.

I must say I’m not at all familiar with 8-bit systems other than the MSX. I’ve never coded on the C64 or the Spectrum or anything else, even during my Basic days. So whenever I talk about 8-bits, it is strictly based on my limited knowledge of other 8-bit systems and my somewhat less limited knowledge of the MSX.

     Still, I’d be willing to say that the MSX is an extremely awkward machine to code hires on. For starters, you don’t have real vram. Oh yeah, there’s a whole 16 kB of it, but it’s not mapped to your regular ram at all. Instead the vram is accessed through ports, one byte at a time; a feature that for reasons beyond my understanding also seems to be implemented on Sega’s Master System, Game Gear and Mega Drive series. The benefit of having real vram mapped to your real ram is obvious: speed. On the darker side you do lose some memory, but really, even 48 kB ought to be enough for everybody. Or the memory could be organised in several 64 kB banks, one of them being the video memory.

     Anyhow, we’re stuck with this feature and that’s it. The real problem is not the raw speed of writing or reading (ram access isn’t that fast either), but that you always read or write the data in the same order. The vram pointer increases every time you IN or OUT the port; so it’s easy to write 512 sequential bytes to vram, but writing 256 bytes to every other memory location is just as slow, since you have to fill the gaps with blanks. Or picture having several bobs rotating around the screen. With real vram it would be easy to make the effect fullscreen by just outputting the bobs to the desired areas of the screen buffer. It’s of course possible to use the same method on the MSX, but you’ll have to set the vram pointer again every time you start to output a new bob. Setting the vram pointer requires two OUT operations, so you can easily see that as the number of bobs increases, the MSX is left further and further behind its 8-bit cousins with real vram.
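
     The cost of re-setting the pointer can be put into numbers with a small sketch. It is a toy model written in Python, not MSX code: it counts OUT operations only and ignores real Z80 timing, and the function name `outs_for_runs` is made up for this illustration.

```python
# Toy cost model: it counts OUT operations only, not real Z80 cycles.
SET_POINTER_OUTS = 2  # from the text: setting the vram pointer takes two OUTs


def outs_for_runs(run_lengths):
    """Total OUTs needed to write a list of sequential runs to vram.

    Every run needs the pointer set once; after that each data byte is
    one OUT, because the vdp increments the pointer automatically.
    """
    return sum(SET_POINTER_OUTS + length for length in run_lengths)


# 512 sequential bytes: one pointer set-up, then 512 data writes.
sequential = outs_for_runs([512])

# 256 bytes on every other address, re-setting the pointer for each byte.
scattered = outs_for_runs([1] * 256)

print(sequential, scattered)
```

     In this model the 512 sequential bytes cost 514 OUTs, while the 256 scattered bytes cost 768 when the pointer is set again for each of them. That is why filling the gaps with blanks is the cheaper option, even though twice as many data bytes go out.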

     Second, you can only write during the vblank. If you try to output too fast during the vretrace, the data will become distorted. Of course you could stuff your code with NOPs, slowing it down intentionally, but hey, the MSX is no monster with its 3.5 MHz, and slowing it down even more from there would be just plain laughable. So we’ll have to do our output during the vblank and that’s it. In practice I’ve found that 2048 writes per vblank is just about the maximum you can output, if your inner loop is optimized enough. In SCREEN 2 the amount of vram used is 12 kB, so it would take six vblanks just to refresh the entire screen. That wouldn’t be so bad if you could doublebuffer, but for the entire screen you can’t, so you get six raster splits. So summa summarum: fullscreen hires effects can’t be created on the MSX. But you should also never say never, since in a future article I will show you how to make fullscreen hires effects by using character based graphics.

     Let’s consider the 2048 bytes once more. SCREEN 3 uses just 1536 bytes for the onscreen data, and if you want to take the easy way, you could stick with your Lego-sized blocks for the rest of your life. And there’s nothing wrong with that; the pc-sceners will be very impressed even if your effect is running with 4×4 pixels in fullscreen. I just tend to think that they’ll be even more impressed when the same effect is running with 1×1 pixels in fullscreen. But maybe that’s just me.
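
     The frame counts follow directly from the figures quoted in the text. A minimal Python sketch of the arithmetic, assuming the 2048-writes-per-vblank figure holds (the helper name `vblanks_needed` is invented here):

```python
import math

WRITES_PER_VBLANK = 2048   # practical maximum quoted in the text
SCREEN2_BYTES = 12 * 1024  # pattern and colour data used by SCREEN 2
SCREEN3_BYTES = 1536       # onscreen data used by SCREEN 3


def vblanks_needed(byte_count, budget=WRITES_PER_VBLANK):
    """Number of vblanks needed to output byte_count bytes."""
    return math.ceil(byte_count / budget)


print(vblanks_needed(SCREEN2_BYTES))
print(vblanks_needed(SCREEN3_BYTES))
```

     SCREEN 2 comes out at six vblanks for a full refresh, whereas the whole of SCREEN 3 fits into a single one.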

Moving on from block city to hires-euphoria
256×192×16 colours. A luxury and a big screenmode for such a small machine. Let’s for starters presume we could change the colour of each pixel. 16 colours means 4 bits per pixel, so the entire screen would take up 256*192 div 2 = 24576 bytes of memory. Wouldn’t that be sweet, but our MSX only comes equipped with 16 kB. We’ll have to cut it somewhere, and someone decided to reduce the number of colours per character. Let’s say we had 2 colours per 8 pixels in width and calculate again: 12288 bytes, and that’s exactly how it is implemented on the MSX.
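
     The two memory figures can be checked with a few lines of Python. This is only the arithmetic from the paragraph above; the function names are invented for the illustration.

```python
WIDTH, HEIGHT = 256, 192
VRAM_BYTES = 16 * 1024  # the 16 kB of vram on the MSX1


def free_colour_bytes(bits_per_pixel=4):
    """Memory needed if every pixel had its own 4-bit colour."""
    return WIDTH * HEIGHT * bits_per_pixel // 8


def two_colours_per_eight_bytes():
    """Memory needed with 2 colours per 8 horizontal pixels."""
    pattern = WIDTH * HEIGHT // 8  # one bit per pixel
    colour = WIDTH * HEIGHT // 8   # one colour byte per 8-pixel row
    return pattern + colour


print(free_colour_bytes(), free_colour_bytes() <= VRAM_BYTES)
print(two_colours_per_eight_bytes(), two_colours_per_eight_bytes() <= VRAM_BYTES)
```

     The first layout needs 24576 bytes and does not fit into 16 kB; the second needs 12288 bytes and does, leaving room for the name table and sprites.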

     Using the graphics like this takes some extraordinary efforts, and that’s why the graphics modes on the MSX are nothing more than text modes. That is, you have a ‘font’: a set of 8×8 characters with non-fixed patterns and colours. In these articles I will refer to the character shapes as patterns and to the table of all the character patterns as the pattern table. One pattern row is 8 pixels stored in 8 bits, which equals one byte. One character takes up 8 bytes in the pattern table. The other attribute defining the character is its colour, stored in the colour table. If we’re in SCREEN 1, one colour byte covers the entire character (it is in fact shared by a group of eight consecutive characters), so the entire character is the same colour. In SCREEN 2, each character row has its own colour, so one character takes up 8 bytes in the colour table. The colour is always calculated as ‘16 * foreground_colour + background_colour’, where the background colour is used for the pixels stored as 0b in the character pattern and the foreground colour for the ones stored as 1b. Now we have our ‘font’, but we still need to know how to place it on screen. Enter our ‘text’, the name table. The name table is a 32×24-sized table with one byte devoted to each ‘letter’ on screen, the ‘letter’ being 8×8 pixels.
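
     How one pattern byte and one colour byte turn into eight pixels can be sketched in Python. The sketch relies on the TMS9918-family layout, where the high nibble of the colour byte holds the colour of the 1-bits (foreground), the low nibble the colour of the 0-bits (background), and the leftmost pixel is the highest bit of the pattern byte. The function names are made up for the illustration.

```python
def colour_byte(foreground, background):
    """Pack two 4-bit colours into one colour table byte."""
    if not (0 <= foreground <= 15 and 0 <= background <= 15):
        raise ValueError("colours must be in the range 0..15")
    return 16 * foreground + background


def render_row(pattern, colour):
    """Return the 8 pixel colours of one character row, left to right."""
    foreground, background = colour >> 4, colour & 0x0F
    # Bit 7 of the pattern byte is the leftmost pixel.
    return [foreground if pattern & (0x80 >> x) else background
            for x in range(8)]


# White (15) on black (1): left half of the row set, right half clear.
print(render_row(0b11110000, colour_byte(15, 1)))
```

     A SCREEN 2 character is simply eight such pattern bytes paired with eight colour bytes, one pair per pixel row.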

     One character set contains 256 characters, that is, a pattern table of 256*8 = 2048 bytes and a colour table of the same size in SCREEN 2, or of just 32 bytes in SCREEN 1, where one colour byte is shared by eight consecutive characters. SCREEN 1 has only one character set, so you really could not do hires fullscreen even if you wanted to. SCREEN 1 is ok for some special pattern based effects, but the one-colour-per-row feature of SCREEN 2 is really such a luxury that from now on you can always presume I’m talking about SCREEN 2 unless I state otherwise.

     So SCREEN 2 is really fullscreen, with every single pixel on the screen changeable as desired. You can’t change every colour because of the ‘2 colours per 8 pixels’ restriction, but you can still change every pattern. Now our entire screen is 256×192 pixels, which equals 32×24 characters. A bit of calculating reveals that a character set of 256 characters only fills one third of the screen. And as I stated earlier, this is exactly the case with SCREEN 1. But for SCREEN 2 to fill the entire screen you actually have three character sets, independent of each other. Let’s call these character sets charset 1, charset 2 and charset 3. Charset 1 always takes up pixel rows 0..63 on screen, charset 2 rows 64..127 and charset 3 rows 128..191. As for the name table (32 characters × 24 characters), charset 1 covers the first 8 rows, charset 2 rows 8..15 and charset 3 rows 16..23.

     A method like this is needed because the name table holds only one byte per position, while 768 different characters are needed for the entire screen. So a write to the name table in rows 0..7 always picks a character from charset 1. If you write something in rows 8..15 it picks a character from charset 2, and likewise rows 16..23 pick from charset 3. For instance, the character ‘A’ in name table position (16,22) will always be the 66th character of charset 3. The character ‘A’ in position (16,4), on the other hand, will always be the 66th character of charset 1. Charsets are divided into patterns and colours just as earlier, so that charset 1 occupies bytes 0..2047 of the pattern table and of the colour table, charset 2 bytes 2048..4095 and charset 3 bytes 4096..6143.
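
     The mapping from a name table position to the bytes that define what appears there can be written down as a small Python sketch. Offsets are relative to the start of each table, the character code of ‘A’ is taken to be its ASCII value 65 (the 66th character when counting from one), and the function names are invented for this illustration.

```python
CHARSET_BYTES = 256 * 8  # 2048 bytes of patterns (or colours) per charset


def name_table_offset(column, row):
    """Offset of a character position in the 32x24 name table."""
    if not (0 <= column < 32 and 0 <= row < 24):
        raise ValueError("position outside the 32x24 name table")
    return row * 32 + column


def charset_index(row):
    """0, 1 or 2 for charset 1, 2 or 3, chosen by the name table row."""
    return row // 8


def table_offset(row, char_code, pixel_row=0):
    """Offset into the pattern table (and equally the colour table)."""
    return charset_index(row) * CHARSET_BYTES + char_code * 8 + pixel_row


code = ord("A")
print(name_table_offset(16, 22), table_offset(22, code))  # charset 3
print(name_table_offset(16, 4), table_offset(4, code))    # charset 1
```

     The same character code thus lands on different pattern and colour bytes depending only on which third of the screen it is written to.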

     You must realize that, by default, these charsets have nothing to do with each other. So the characters ‘A’ in the above example need not be the same characters. By tweaking a few vdp registers you can actually alter the number of charsets to three, two or one, a feature no-one ever bothered to document, but which is sometimes very useful. So for instance, you could have charset 1 mirrored over charset 3, so that the ‘A’s in the above example would be the same characters. The most bizarre part of this vdp feature is that the number of pattern tables actually needn’t be the same as the number of colour tables. So you could for example have three totally independent colour tables but just one pattern table for the entire screen. This can be used for some very fruitful and seemingly impossible demo effects, but I’ll cover these in more depth in the future, when I write an article on the ‘charset duplication and deproduction’ vdp feature.

More & Faster
A bit on the numbers and speed once more. 2048 bytes per frame, that is, if you want to output the stuff to the screen at 50 Hz. Now I’m not a puritan who requires all his effects to run that fast (yet), but since you can’t doublebuffer in the common case, it’s raster split time. The restriction on doublebuffering is mostly due to the fact that you can only set the colour table and pattern table base addresses with 13-bit accuracy in SCREEN 2; see ‘charset duplication and deproduction’ for more on this subject in the future. For fullscreen you wouldn’t have enough memory to doublebuffer anyhow, but the restriction effectively prevents you from doing it for smaller windows too; way to go. However, I have discovered that doublebuffering actually is possible in the restricted case of using a constant pattern table, drawing the effects to the colour table and only using rows 64..191. Watch this space.

     Now someone could point out that we should go for SCREEN 1, which has enough memory for 3 pattern tables and 24 colour tables, and best of all, colour table and pattern table base-registers that actually act the way they’re supposed to. Well, that’s a good idea, but remember we only have 256 characters in SCREEN 1. Those 256 characters make 2048 outputs and you could output that much in a frame anyway.

     So for now, we’re sticking with the 2048 outputs. That’s not very much really. You could refresh one pattern table or colour table of one charset, or half of a pattern table and half of a colour table of a charset. Since one charset is only 256×64 pixels, one third of the whole screen, we will have to get realistic and only refresh either the colour table or the pattern table. Hence the terms chunky based effects and pattern based effects. Chunky based effects have a constant pattern table, and pattern based effects a constant colour table, for the whole of the screen. In the next issue I’ll be going into more depth on how to actually code chunky and pattern based effects in practice, complete with some source code too.

     But still, 256×64 is nobody’s idea of fullscreen and that’s the most we can update, so what is it these SCREEN 2 demos are using, magic? Oh no, 8-bits have one God-given gift on their side and that’s character based graphics. Duplicate, reduplicate, copy, replicate and unduplicate as much as you ever please, because that’s what fools the eye. Remember, character ‘A’ is always the 66th character, with the appropriate colour and pattern, within the same charset. It’s basically the very same case as when you’re reading this text right now. You can write for instance ‘JGT’ in any part of the screen and it will still look the same. Maybe have the effect running in an 80×192 window and replicate it three times vertically. With just one register tweak you can reduce the number of charsets to one and have horizontally replicating effects too, sort of like a super SCREEN 1. If you look closely at the SCREEN 2 effects coded so far, you’ll notice almost every single one uses some sort of replication; either that, or they’re really small. Or they’re impossible, using clever vdp features; there are a few of those in the MSX1. Remember, 256×64, or an area of any shape that adds up to 2048 updates, is the biggest updatable unit in pattern based and chunky based effects. Look at the SCREEN 2 effects again and you’ll notice none of them actually updates an area larger than this; apart from the impossible ones of course.
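
     Whether a given window fits into the budget is a quick calculation. A minimal Python sketch, assuming the window is aligned to 8×8 characters and that only one table (patterns or colours) is updated; the names `update_cost` and `fits_budget` are made up for the illustration.

```python
BUDGET = 2048  # outputs per frame, the figure used throughout the text


def update_cost(width_px, height_px):
    """Bytes to output when refreshing one table for a window."""
    if width_px % 8 or height_px % 8:
        raise ValueError("window must be a whole number of characters")
    characters = (width_px // 8) * (height_px // 8)
    return characters * 8  # 8 pattern (or colour) bytes per character


def fits_budget(width_px, height_px, budget=BUDGET):
    return update_cost(width_px, height_px) <= budget


for size in [(256, 64), (80, 192), (256, 192)]:
    print(size, update_cost(*size), fits_budget(*size))
```

     The 256×64 strip uses the budget exactly, the 80×192 window stays just under it with 1920 bytes, and the true fullscreen 256×192 needs three times too much. Replicating the small window across the screen costs nothing extra, because the copies are made by the name table and not by additional output.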

     Actually the super SCREEN 1 I mentioned earlier is a well-known feature in the MSX scene, implemented by first setting SCREEN 1 via the bios and then setting the SCREEN 2 mode bit in the vdp registers. This, however, is nothing more than one single implementation of the ‘..deproduction’ feature. The register values just happen to be right by accident for one charset in SCREEN 2; there’s a lot more than that to ‘..deproduction’.

Let’s talk physical
I won’t be giving out any effects or code yet, since we’re still moving at a pretty basic, introductory level. For now you should familiarise yourself with the very basic I/O stuff, like setting up the vdp address, writing to the screen, changing the screenmode and things like that. PORTAR.TXT (see reference [1]) is one good reference for that kind of stuff. If you already know these things, stay tuned for the next issue, where I’ll be giving away two effects: the interference rings and the twisting bars, both of them pattern based effects.

Aside from pattern based and chunky based effects there’s also a third branch of hires effects, the character based effects. Think about it, the name table is only 768 bytes long for the whole of the screen. Now why not draw the effect in the name table and use fixed characters with fixed patterns and colours? Now what kind of effects would that make possible? Just don’t make any noise about it, it’s a secret, although by checking a certain classic game cartridge by Konami you’ll notice it was already invented 15 years ago.
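
     To give a rough feel for the size of the job, here is a purely hypothetical Python sketch that builds one name table frame. The XOR pattern is just a placeholder and not an effect from this series; it assumes all three charsets hold the same fixed characters, so that a given byte looks the same in every third of the screen.

```python
COLUMNS, ROWS = 32, 24
BUDGET = 2048  # outputs per frame, the figure used throughout the text


def name_table_frame(t):
    """Build the 768 name table bytes for frame number t.

    Placeholder effect: a character code derived from the position
    and shifted by one for every frame.
    """
    return bytes(((x ^ y) + t) & 0xFF
                 for y in range(ROWS) for x in range(COLUMNS))


frame = name_table_frame(0)
print(len(frame), len(frame) <= BUDGET)
```

     The whole screen is 768 bytes per frame, comfortably inside the 2048 outputs available, and it goes out as one sequential block, so the vram pointer needs to be set only once.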
