It's bad to be a perfectionist when you're trying to make a living making software, except when your goal is to make a very strong Go program. Because I don't think cutting corners is going to work.
While I was implementing the n-th order liberty functionality, I did of course a lot of literature research. GNU Go does not expand them into (first order) enemy liberties, so I implemented that as well because it makes sense from a Go programming POV. But then I came to the conclusion that cheap tricks are not good enough.
So I decided to do a "Bouzy territory" approximation, where there are force fields between chains. So a black chains' 4th order liberty can't expand into a white chains' 3rd order liberty. This makes sense from a Go programming POV, because when both chains are alive, white is able to prevent black from expanding onto that point. And this knowledge is very useful to have in a tactical search. The expense of maintaining it may well be outweighed by greatly increased pruning efficiency and a better sorting of plausible moves.
This would be easy to make, if it weren't for severe (L1) space and (execution) time constraints. I can't have large lookup tables and linked lists all over the place. Everything, including the lookhead data, has to fit into 64 KB of L1 data cache. So I have to make do with my existing data structures to make a Bouzy-type map, and still make it very fast. This kind of stuff is pretty slow even with the most efficient data structures, let alone when you need to keep track of the number of order-libs per chain and do this from sub-optimal data. (Sub-optimal for that particular task, not for the system as a whole).
It doesn't help that the Opteron (AMD64) needs 9 cycles for a BSF (Bit Scan forward, or better: Barrel Shifter Forward). I guess I have to "Roeien met de riemen die ik heb". (A Dutch expression, literally: "Row with the oars I have").
Update: Blogging really helps! By expressing my thoughts here and re-reading them - trying to attack their premisses - I came to the conclusion that maintaining a Bouzy map is in fact totally superfluous, because I can achieve the same result by cleverly processing the data I already am maintaining. Your mileage may vary in your own Go program, however.
About the development of Moyo Go Studio, software to (help) play the Oriental game of Go. Go is a two-player zero-sum game of perfect information. It is considered much harder than Chess. Currently, in spite of enormous effort expended, no computer program plays it above the level of a beginner.
Saturday, June 24, 2006
Wednesday, June 21, 2006
Bugfixing the State Machine
Some say the average # of LOC/day is 20 for "Joe average", and I think that's valid for me as well. That includes bugfixing, maintenance, design, testing, surfing the web for conspiracy theories and a general lack of concentration due to neuroborreliosis.
I've been at it (The Big Bug Hunt) for a day now, but I've spent less than an hour at actual bugfixing. Part of the day was spent in bed in a collapsed stupor, I started trembling in the kitchen and had to lie down.
I've found half a dozen of bugs already. One was me trying to clear a bit using XOR, but forgetting that this bit isn't neccessarily set in the first place, so I had to do an AND with the NOT of the mask I used with the XOR. Another bug was caused by a stray stack pointer. A few bugs were discovered in pointer arithmetic here & there.
I got as far as verifying that a single stone, put on Tengen, gets its neighbor lists and liberty count updated, that the neighbor of that stone (an empty chain with the size of the board) gets its size reduced, and that the n-th order liberty fill code works at least partially correct.
I still haven't yet tested merging of chains, capturing chains or the delta-updating of n-th order liberties. I'm very slow with coding & debugging nowadays, often I go lie on the bed to conjure up a hypothesis before I return to my desk to test it. I usually don't debug without a testable hypothesis.
Chess programmers (and to a lesser extent Go programmers) often report serious bugs in their engine code that surfaced only years after they had a successful program, and I want to avoid that. In other words, I'm going to test the living daylights out of this code, using a test file with Go problems and their known solutions. But that's something for the future - first the TsumeGo code itself needs to be developed.
Since I don't play Go and I am an opponent of trying to be "clever" (= putting the programmer's Go knowledge in a Go program), I'm going to use automatic learning for the TsumeGo's plausible move generator and its ordering heuristics. This is near-trivial compared to writing the state machine itself, at least for me it is because I have several years of experience with automatic learning applied to Go. Not just with "pattern urgency", I also have done successful work with connectivity patterns.
So, where other folks have specialized ladder code, net code and liberty-filling code, I'll have nothing of the sort. Later I'll might code a fast ladder module as a special case but first I will design and build a plausible move generator/sorter that finds ladder moves, tries them first and follows them as deep as needed, all automatically. I will have to program nothing for that. It will be a very simple, elegant, small and fast self-learning expert system. All basic data structures for it are already in place and are updated as part of the state machine.
I have to admit it's a bit scary. David Fotland has efficient code that moves on liberties of threatened chains, and of course this is very fast. Faster than my expert system. But Fotland's code has no, or at least no solid notion about n-th order liberties and hence has difficulties with nets. In fact the big weakness of most, if not all Go programs today is the lack of tactical insight in terms of getting too dangerously enclosed. My design caters for early static detection of enclosure in the evaluation function, but my plausible move function will generate moves that try to enclose using-, resp. try to escape from nets of 2nd or higher order all by itself, using automatic learning. If this works as it should, the result is an extremely strong tactical module that can solve "open" (non-enclosed) TsumeGo problems in acceptable time. If this does not work as it should, it has to be fixed :)
You understand that my approach is highly superior to any 4d or 5d or even pro putting hundreds of if's in his plausible move generator. My approach takes half a million games and learns which tactical moves are plausible in which tactical situations. Using an expert system is slower than using if's, but the sophistication of pruning is much better, so I can achieve the Holy Grail of Go programming, a very narrow search tree. My plausible move generator for the Tactics module, contrary to Moyo Go's pattern expert, will not use actual patterns in the form of Zobrist hashes because this is unacceptable for reasons of accuracy and speed.
Neither will I use a neural network because I am of the old school - I avoid multiplications like the plague. I have no problems with black boxes, but only when I can pry them open and have a peek in them and understand or even influence what's going on in there.
I've been at it (The Big Bug Hunt) for a day now, but I've spent less than an hour at actual bugfixing. Part of the day was spent in bed in a collapsed stupor, I started trembling in the kitchen and had to lie down.
I've found half a dozen of bugs already. One was me trying to clear a bit using XOR, but forgetting that this bit isn't neccessarily set in the first place, so I had to do an AND with the NOT of the mask I used with the XOR. Another bug was caused by a stray stack pointer. A few bugs were discovered in pointer arithmetic here & there.
I got as far as verifying that a single stone, put on Tengen, gets its neighbor lists and liberty count updated, that the neighbor of that stone (an empty chain with the size of the board) gets its size reduced, and that the n-th order liberty fill code works at least partially correct.
I still haven't yet tested merging of chains, capturing chains or the delta-updating of n-th order liberties. I'm very slow with coding & debugging nowadays, often I go lie on the bed to conjure up a hypothesis before I return to my desk to test it. I usually don't debug without a testable hypothesis.
Chess programmers (and to a lesser extent Go programmers) often report serious bugs in their engine code that surfaced only years after they had a successful program, and I want to avoid that. In other words, I'm going to test the living daylights out of this code, using a test file with Go problems and their known solutions. But that's something for the future - first the TsumeGo code itself needs to be developed.
Since I don't play Go and I am an opponent of trying to be "clever" (= putting the programmer's Go knowledge in a Go program), I'm going to use automatic learning for the TsumeGo's plausible move generator and its ordering heuristics. This is near-trivial compared to writing the state machine itself, at least for me it is because I have several years of experience with automatic learning applied to Go. Not just with "pattern urgency", I also have done successful work with connectivity patterns.
So, where other folks have specialized ladder code, net code and liberty-filling code, I'll have nothing of the sort. Later I'll might code a fast ladder module as a special case but first I will design and build a plausible move generator/sorter that finds ladder moves, tries them first and follows them as deep as needed, all automatically. I will have to program nothing for that. It will be a very simple, elegant, small and fast self-learning expert system. All basic data structures for it are already in place and are updated as part of the state machine.
I have to admit it's a bit scary. David Fotland has efficient code that moves on liberties of threatened chains, and of course this is very fast. Faster than my expert system. But Fotland's code has no, or at least no solid notion about n-th order liberties and hence has difficulties with nets. In fact the big weakness of most, if not all Go programs today is the lack of tactical insight in terms of getting too dangerously enclosed. My design caters for early static detection of enclosure in the evaluation function, but my plausible move function will generate moves that try to enclose using-, resp. try to escape from nets of 2nd or higher order all by itself, using automatic learning. If this works as it should, the result is an extremely strong tactical module that can solve "open" (non-enclosed) TsumeGo problems in acceptable time. If this does not work as it should, it has to be fixed :)
You understand that my approach is highly superior to any 4d or 5d or even pro putting hundreds of if's in his plausible move generator. My approach takes half a million games and learns which tactical moves are plausible in which tactical situations. Using an expert system is slower than using if's, but the sophistication of pruning is much better, so I can achieve the Holy Grail of Go programming, a very narrow search tree. My plausible move generator for the Tactics module, contrary to Moyo Go's pattern expert, will not use actual patterns in the form of Zobrist hashes because this is unacceptable for reasons of accuracy and speed.
Neither will I use a neural network because I am of the old school - I avoid multiplications like the plague. I have no problems with black boxes, but only when I can pry them open and have a peek in them and understand or even influence what's going on in there.
Tuesday, June 20, 2006
Board Coordinate System
I like to have a paper version of the internal board coordinate system, to make debugging easier.
My first grid was 64x64, to make it fastest to calculate a neighbor point. So that grid had huge borders around the goban.
The new grid has no border at all. The coordinates are simply numbered from 0 to 360.
My first grid was 64x64, to make it fastest to calculate a neighbor point. So that grid had huge borders around the goban.
The new grid has no border at all. The coordinates are simply numbered from 0 to 360.
The Final Countdown
I succeeded in making an obscenely fast nth-order liberty delta-updater, and soon it's time to make what Michael Reiss' debug version of Go++ has: A plethora of debug windows.
Mick was so kind to send me his highly intimidating debug version of Go++, perhaps as a subtle form of psychological warfare amongst Go programmers ;-). Programming Go is one of the most complex endeavors on the globe, comparing to neurosurgery as neurosurgery compares to pruning roses. Pruning roses properly requires talent, skill and years of experience and so does neurosurgery - on a higher level, but the level of knowledge, talent and skill required for any kind of surgery pales next to what's needed to design and implement a state-of-the art Go program. So far, the only Western folks who succeeded were dan-level players or close and:
1. A designer of microprocessors (Many Faces of Go)
2. A former Microsoft programming wizard (SmartGo)
3. A dual PhD in Physics and Neural networks (Go++)
4. Another top expert in neural networks, working for a government (WinHonte)
5. A team of various expert developers with even a comp. sci professor in their ranks (GNU Go)
I have no education or Go ranking to boast of, so it's going to be interesting.
"The Final Countdown" refers to the firing up of the move engine. It's fueling up, to use a metaphor. As I said, I need to code up quite a lot of introspection stuff, because I'm not under the illusion that the beast will work as it should fresh from the press. And even if it does, I want to be able to verify this reliably and quickly, and be able to see where things go wrong.
So I'll need test code and GUI code that dumps the internal state. Although the state machine is coded with insane efficiency in mind (perhaps on a par with Vincent Diepeveen's branchless move generator for Chess), I expect to do a lot of optimization and I need to know immediately when I broke something.
I think that coding the debug- & test framework will take up a couple of weeks at most - I'm in the process of slowly moving domicilie - and debugging can take anything up to several months, depending on how many fundamental flaws in my design assumptions surface. I waited quite a few years before I started with the tactical module because haste makes waste.
Browsing the comp. Go mailing list, I was surprised to find a few colleagues mentioning how their tactics module took more than a second to generate and evaluate a position. I am aiming for at least 50,000 nodes/s. on my hardware (that's just a very rough estimate of what I think should be possible). And not just any nodes but highly relevant ones. Pruned in a clever way.
I won't use a hand-coded expert-system to prune the tree, I'll let automatic learning decide which points are interesting to play on. I have already done that very successfully with my best selling Moyo Go "pattern expert" software, but the plausible move generator I am making now will be specialized for tactical situations.
Mick was so kind to send me his highly intimidating debug version of Go++, perhaps as a subtle form of psychological warfare amongst Go programmers ;-)
1. A designer of microprocessors (Many Faces of Go)
2. A former Microsoft programming wizard (SmartGo)
3. A dual PhD in Physics and Neural networks (Go++)
4. Another top expert in neural networks, working for a government (WinHonte)
5. A team of various expert developers with even a comp. sci professor in their ranks (GNU Go)
I have no education or Go ranking to boast of, so it's going to be interesting.
"The Final Countdown" refers to the firing up of the move engine. It's fueling up, to use a metaphor. As I said, I need to code up quite a lot of introspection stuff, because I'm not under the illusion that the beast will work as it should fresh from the press. And even if it does, I want to be able to verify this reliably and quickly, and be able to see where things go wrong.
So I'll need test code and GUI code that dumps the internal state. Although the state machine is coded with insane efficiency in mind (perhaps on a par with Vincent Diepeveen's branchless move generator for Chess), I expect to do a lot of optimization and I need to know immediately when I broke something.
I think that coding the debug- & test framework will take up a couple of weeks at most - I'm in the process of slowly moving domicilie - and debugging can take anything up to several months, depending on how many fundamental flaws in my design assumptions surface. I waited quite a few years before I started with the tactical module because haste makes waste.
Browsing the comp. Go mailing list, I was surprised to find a few colleagues mentioning how their tactics module took more than a second to generate and evaluate a position. I am aiming for at least 50,000 nodes/s. on my hardware (that's just a very rough estimate of what I think should be possible). And not just any nodes but highly relevant ones. Pruned in a clever way.
I won't use a hand-coded expert-system to prune the tree, I'll let automatic learning decide which points are interesting to play on. I have already done that very successfully with my best selling Moyo Go "pattern expert" software, but the plausible move generator I am making now will be specialized for tactical situations.
Sunday, June 18, 2006
Engineers vs. Mathematicians
One of the key components of a strong tactical module is the part that maintains a count of the n-th order liberties. My design however needs more than that: It needs to know, for each point, which chains have what order liberties there. All this has to be updated with lightning speed (preferably faster) when stones are placed and chains are merged or captured. Simply recalculating everything would be too expensive.
Another bone of contention is the maximum order number of liberty that we should track. The higher it is, the slower the lookahead, and lookahead compensates for having a low max. order number. I've decided to dedicate no more than 20% of the L1 cache for liberty-order administration, which yielded a max. liberty order number of 4. I think this is the optimum value for 64-bit hardware when the algorithms are optimized for this architecture.
I am now starting to code the n-th order liberty delta-update module, after having spent days struggling with getting a grasp on the most efficient way to do it, and deciding which special cases could safely be ignored. Chess programs are surprisingly simple state machines with efficient info-extractors operating on them, and many tricks are used to cut corners, as long as it works. I am doing the same thing with my Go program. My design for fast delta-updating of n-th order liberties is faulty in case chains of a certain shape are merged. But this is a rare case, and its consequences are relatively minor.
In practice, the system will work like a charm. This approach is an advantage to that of the Mathematician. Mathematicians are looking for accurate ways to describe systems. Engineers are simply trying to build things that do what they're supposed to do.
Another bone of contention is the maximum order number of liberty that we should track. The higher it is, the slower the lookahead, and lookahead compensates for having a low max. order number. I've decided to dedicate no more than 20% of the L1 cache for liberty-order administration, which yielded a max. liberty order number of 4. I think this is the optimum value for 64-bit hardware when the algorithms are optimized for this architecture.
I am now starting to code the n-th order liberty delta-update module, after having spent days struggling with getting a grasp on the most efficient way to do it, and deciding which special cases could safely be ignored. Chess programs are surprisingly simple state machines with efficient info-extractors operating on them, and many tricks are used to cut corners, as long as it works. I am doing the same thing with my Go program. My design for fast delta-updating of n-th order liberties is faulty in case chains of a certain shape are merged. But this is a rare case, and its consequences are relatively minor.
In practice, the system will work like a charm. This approach is an advantage to that of the Mathematician. Mathematicians are looking for accurate ways to describe systems. Engineers are simply trying to build things that do what they're supposed to do.
Friday, June 16, 2006
Moving to the Middle of Nowhere
This is an aerial photo of the new tiny place where my girlfriend Martina and I will live together, two weeks from now. This is final - I have signed the contract and paid the rent for July, yee-haw! Hail to the hefty Norwegian tax returns :-)
Armenia will have to wait a decade or so - until they have ADSL. I can't run my business from a dodgy dialup line :-/
Imagine how easy it will be to concentrate there.. It's in a dead-end side part of a dead-end street, essentially in the middle of the forest, just what I always wanted.
Update: That Båntjern is a real gem - we swam in it a few times already.
Armenia will have to wait a decade or so - until they have ADSL. I can't run my business from a dodgy dialup line :-/
Imagine how easy it will be to concentrate there.. It's in a dead-end side part of a dead-end street, essentially in the middle of the forest, just what I always wanted.
Update: That Båntjern is a real gem - we swam in it a few times already.
Wednesday, June 14, 2006
Quick Merging Coordinate Lists
I've been struggling with a fast way to merge coordinate lists that have duplicates between them.
I now have a way that takes literally only a handful of clock cycles per coordinate, on average. As usual I'm not going to tell you how it works, but what's at least as interesting - and useful for you - is how I arrived at the algorithm.
This afternoon I had been dozing off and just as I entered that twilight zone of pre-REM sleep, it just popped into my head. Like my own voice telling me in a short and simple sentence how to merge coordinate lists and filtering duplicates. It's very cool. It does not involve checking in the result list whether a coordinate already occurs in it. That's why it's so fast :)
I am amazed at the human brain. Because I haven't conciously been pondering the problem. I originally coded a solution that took about 15 cycles per coordinate and all I did was occasionally wondering: "Shouldn't there be a faster way?". My subconscious gave me the answer as soon as I gave it a chance.
I now have a way that takes literally only a handful of clock cycles per coordinate, on average. As usual I'm not going to tell you how it works, but what's at least as interesting - and useful for you - is how I arrived at the algorithm.
This afternoon I had been dozing off and just as I entered that twilight zone of pre-REM sleep, it just popped into my head. Like my own voice telling me in a short and simple sentence how to merge coordinate lists and filtering duplicates. It's very cool. It does not involve checking in the result list whether a coordinate already occurs in it. That's why it's so fast :)
I am amazed at the human brain. Because I haven't conciously been pondering the problem. I originally coded a solution that took about 15 cycles per coordinate and all I did was occasionally wondering: "Shouldn't there be a faster way?". My subconscious gave me the answer as soon as I gave it a chance.
Sunday, June 11, 2006
Delta-updating n-th order Liberties
I decided I need to administrate a much higher order of liberties* (5 seems a good start), to be able to detect that a chain has no room to escape.
I found a way to do this extremely quickly, and get payback for the expense in clockcycles in the form of a much more agressive & accurate tree pruning/plausible move ordering.
The method is very exotic and involves neither bitboards nor coordinate lists. It has most definitely never been done before, otherwise there would already been dan-level Go programs :)
It goes too far to explain it all, the system can only work when other, similarly complex systems are in place but I am confident it will work. The most exciting aspect is that I think that there'll be a working tactical module "technology preview" this year.
Some folks still seem to think I'm making a "pattern-based" Go program, well, I am doing no such thing. I can't help it that my Pattern library (for Joseki & Fuseki etc.) was such a huge success that many now think that somehow that is going to be the "core" of my Go engine. Wrong. The pattern expert has been made to do Fuseki, Joseki & Good Shape, never anything else. Not TsumeGo! Not connectivity! Some egomaniacs/ignoramusses like to tell me that it's "impossible" to write a strong Go program, or that a bad Go player can't write a strong Go program.
Their asses will be used to wipe the floor with :)
*The 5th order liberties of a chain is the wavefront of a floodfill of liberties of liberties of liberties of liberties of liberties.
I found a way to do this extremely quickly, and get payback for the expense in clockcycles in the form of a much more agressive & accurate tree pruning/plausible move ordering.
The method is very exotic and involves neither bitboards nor coordinate lists. It has most definitely never been done before, otherwise there would already been dan-level Go programs :)
It goes too far to explain it all, the system can only work when other, similarly complex systems are in place but I am confident it will work. The most exciting aspect is that I think that there'll be a working tactical module "technology preview" this year.
Some folks still seem to think I'm making a "pattern-based" Go program, well, I am doing no such thing. I can't help it that my Pattern library (for Joseki & Fuseki etc.) was such a huge success that many now think that somehow that is going to be the "core" of my Go engine. Wrong. The pattern expert has been made to do Fuseki, Joseki & Good Shape, never anything else. Not TsumeGo! Not connectivity! Some egomaniacs/ignoramusses like to tell me that it's "impossible" to write a strong Go program, or that a bad Go player can't write a strong Go program.
Their asses will be used to wipe the floor with :)
*The 5th order liberties of a chain is the wavefront of a floodfill of liberties of liberties of liberties of liberties of liberties.
Saturday, June 10, 2006
Eureka!
In order to get the really good ideas I have to get away from the computer. So I went to this lake and thought.
Contrary to the Internet, this lake does not have evil enemies that want to destroy me.
The "seed" of the idea appeared while I was walking to it (about five clicks uphill from a suburbial subway station). The analysis was done at the lake. Realizing the whole thing was a lousy idea came on the way back, including the alternative idea.
The alternative idea is Big. Really Big. Can't talk about it, sorry. Nice lake, isn't it?
Contrary to the Internet, this lake does not have evil enemies that want to destroy me.
The "seed" of the idea appeared while I was walking to it (about five clicks uphill from a suburbial subway station). The analysis was done at the lake. Realizing the whole thing was a lousy idea came on the way back, including the alternative idea.
The alternative idea is Big. Really Big. Can't talk about it, sorry. Nice lake, isn't it?
2nd-order Liberties
I realized I'm not done yet. Many tactical problems involve moving on second order liberties. I will make a tactical pattern recognizer based on Moyo Go's existing pattern system (with smaller patterns but therefore a thousand times faster and more sophisticated). That module needs to know about 2nd order liberties anyway, so I might just as well maintain them in the tactical engine.
The tactical pattern classifier will order plausible moves based on automated learning from a billion moves in pro- and high-ranking ama games. This neatly avoids me having to learn & encode hundreds or even thousands of patterns and heuristics, and it's a much superior method anyway. It will be easy to build such a system: I've done it before and I already know it works very well :)
I will camp in the forest this weekend and design both: A fast incremental second-order liberty updater and a tactical pattern expert suitable for a hyperfast plausible move generator.
The PMG's hash table should fit snugly in the Opteron's L2 cache, which is 1 MB.
The tactical pattern classifier will order plausible moves based on automated learning from a billion moves in pro- and high-ranking ama games. This neatly avoids me having to learn & encode hundreds or even thousands of patterns and heuristics, and it's a much superior method anyway. It will be easy to build such a system: I've done it before and I already know it works very well :)
I will camp in the forest this weekend and design both: A fast incremental second-order liberty updater and a tactical pattern expert suitable for a hyperfast plausible move generator.
The PMG's hash table should fit snugly in the Opteron's L2 cache, which is 1 MB.
Friday, June 09, 2006
Tactical Module Progress Report
The state machine, after three weeks of coding, is done. It looks outlandishly complex but I used indentation, empty lines, column-alignment, meaningful identifiers and comments to alleviate some of that.
It has around 1250 "real" lines, and Pascal is a little more verbose than C so It would be around 1000 LOC in C. But only when the same optimization hacks are used, otherwise it's around 600 lines.
All these weeks I have been coding, I haven't done a single test run. Neither will I, in the very near future. First I'll draw a large flowchart of what this code is doing, because it's quite complex and very easy to oversee details. I have a pile of A3 sheets for this purpose. It will take several attempts and sticky tape to create the flowchart (Nassi-Schneiderman diagrams suck).
After the flowchart has been studied and shown to be an accurate reflection of how the state machine should look like, I'll add visualisation code to show the state of the state machine, and a method of adding stones. In spite of Delphi's Windows/GUI capabilities, this has to be stdin/stdout stuff because still I need to be able to test the EXE when it has been compiled as a console app by Free Pascal, and when it's assembled by MASM, after I hand-optimized Free Pascal's 64-bit MASM output. Many tricks have been used to take advantage of modern CPU's 64-bit registers, but Delphi can compile the tactical module to run on 32-bit machines as well, by emulating 64-bit words in two 32-bit doublewords.
So from now on it's documenting, testing, bugfixing and benchmarking the core TsumeGo state machine - a major milestone has been reached - exciting times ahead!
I want to stress again how important a dedication to speed is, with this kind of code. I have learned a lot. 16-bit words need one more cycle to load than larger sizes. Bytes are very slow (I'm talking of the only serious 64-bit contenders out there - made by AMD). I've been carefully weighing the benefit of gaining a cycle by using lookup tables with the drawback of straining the cache. Realistic "stress code" will be used to experiment with alternative data representations and the algoritms acting upon them. Speed in this module is of prime importance so I will not rest until it can't go any faster.
The speed of state maintenance is very dependent on what you want to know about a tactical situation of course. It's a complex equilibrium of speed of delta-updating the state and the speed of extracting what you need to know from that state.
It has around 1250 "real" lines, and Pascal is a little more verbose than C so It would be around 1000 LOC in C. But only when the same optimization hacks are used, otherwise it's around 600 lines.
All these weeks I have been coding, I haven't done a single test run. Neither will I, in the very near future. First I'll draw a large flowchart of what this code is doing, because it's quite complex and very easy to oversee details. I have a pile of A3 sheets for this purpose. It will take several attempts and sticky tape to create the flowchart (Nassi-Schneiderman diagrams suck).
After the flowchart has been studied and shown to be an accurate reflection of how the state machine should look like, I'll add visualisation code to show the state of the state machine, and a method of adding stones. In spite of Delphi's Windows/GUI capabilities, this has to be stdin/stdout stuff because still I need to be able to test the EXE when it has been compiled as a console app by Free Pascal, and when it's assembled by MASM, after I hand-optimized Free Pascal's 64-bit MASM output. Many tricks have been used to take advantage of modern CPU's 64-bit registers, but Delphi can compile the tactical module to run on 32-bit machines as well, by emulating 64-bit words in two 32-bit doublewords.
So from now on it's documenting, testing, bugfixing and benchmarking the core TsumeGo state machine - a major milestone has been reached - exciting times ahead!
I want to stress again how important a dedication to speed is, with this kind of code. I have learned a lot. 16-bit words need one more cycle to load than larger sizes. Bytes are very slow (I'm talking of the only serious 64-bit contenders out there - made by AMD). I've been carefully weighing the benefit of gaining a cycle by using lookup tables with the drawback of straining the cache. Realistic "stress code" will be used to experiment with alternative data representations and the algoritms acting upon them. Speed in this module is of prime importance so I will not rest until it can't go any faster.
The speed of state maintenance is very dependent on what you want to know about a tactical situation of course. It's a complex equilibrium of speed of delta-updating the state and the speed of extracting what you need to know from that state.
Hacker's Delight - Delightful Indeed
That book - Hacker's Delight - is everything I hoped it would be. This is what I got out of it:
1. I need only 16 cycles + 1 if to know whether a certain chain is adjacent to a board point. Normally this would take 4 ifs, but a non-taken branch costs at least 10 cycles. My code is at least twice as fast, on average.
2. I found a perhaps novel, if-less way to count the number of trailing 0's in a 64-bit register, five times faster than right-shifting and examining the LSB. The book still used four comparisons to do that. The code that shifts, compares and increments a point coordinate takes five cycles per trailing zero, so the average with sparsely populated registers (a 1 on position 31) is 150 cycles. My code takes 30 cycles for any number of trailing zero's. To arrive at this solution, I modified and combined two techniques described in the book and adapted them to 64-bits. This code efficiently converts bitmaps to coordinate lists.
I am starting to get convinced that MoyoGo's tactical module will be the absolute fastest. I maintain data structures that allow me to determine life/death status in near-theoretical optimum time, when running the most efficient, recently published algorithms.
So the design & implementation process of the TsumeGo module went like this:
A. Find the fastest method of determining L&D in Go literature.
B. Find the data structures one needs to maintain in order to do (A).
C. Implement a state machine that incrementally maintains (B).
D. Use every trick in the book to speed up (C).
There will be (E), namely finding and implementing a set of heuristics to prune the search tree. I already have a move-suggester in the form of a superfast, Zobrist-hash based pattern classification engine, and Go literature will yield the rest. But A, B, C and D are quite independent of E. As long as certain questions can be quickly answered about the state of the state machine, E can be efficiently implemented. I expended a lot of effort reading relevant Go programming literature and studying books on life & death to ensure that compex queries about the state can be executed in minimal time, so I am confident about (E) as well.
By making this code as fast as I can, I am putting distance between me and the competition. It's like putting a 20 GHz CPU in my customer's PC that can only be used to run Moyo Go. Chess programmers spend their prime shaving off cycles, for Go this is as least as important because Go needs a much faster computer to be able to play well. Saying that computers are too slow to play well and that therefore speed is irrelevant is defeatism.
1. I need only 16 cycles + 1 if to know whether a certain chain is adjacent to a board point. Normally this would take 4 ifs, but a non-taken branch costs at least 10 cycles. My code is at least twice as fast, on average.
2. I found a perhaps novel, if-less way to count the number of trailing 0's in a 64-bit register, five times faster than right-shifting and examining the LSB. The book still used four comparisons to do that. The code that shifts, compares and increments a point coordinate takes five cycles per trailing zero, so the average with sparsely populated registers (a 1 on position 31) is 150 cycles. My code takes 30 cycles for any number of trailing zero's. To arrive at this solution, I modified and combined two techniques described in the book and adapted them to 64-bits. This code efficiently converts bitmaps to coordinate lists.
I am starting to get convinced that MoyoGo's tactical module will be the absolute fastest. I maintain data structures that allow me to determine life/death status in near-theoretical optimum time, when running the most efficient, recently published algorithms.
So the design & implementation process of the TsumeGo module went like this:
A. Find the fastest method of determining L&D in Go literature.
B. Find the data structures one needs to maintain in order to do (A).
C. Implement a state machine that incrementally maintains (B).
D. Use every trick in the book to speed up (C).
There will be (E), namely finding and implementing a set of heuristics to prune the search tree. I already have a move-suggester in the form of a superfast, Zobrist-hash based pattern classification engine, and Go literature will yield the rest. But A, B, C and D are quite independent of E. As long as certain questions can be quickly answered about the state of the state machine, E can be efficiently implemented. I expended a lot of effort reading relevant Go programming literature and studying books on life & death to ensure that compex queries about the state can be executed in minimal time, so I am confident about (E) as well.
By making this code as fast as I can, I am putting distance between me and the competition. It's like putting a 20 GHz CPU in my customer's PC that can only be used to run Moyo Go. Chess programmers spend their prime shaving off cycles, for Go this is as least as important because Go needs a much faster computer to be able to play well. Saying that computers are too slow to play well and that therefore speed is irrelevant is defeatism.
Thursday, June 08, 2006
SmartGo FUD Countermeasure
I'm constantly amazed at how low people can stoop when they find themselves like a rat in a corner.
Take SmartGo. Moyo Go Studio, rising star, leading as it is in Go software R&D and the most innovative Go software on the market, implemented a "Kombilo" pattern search not so long ago. MoyoGo was the first, and managed to best Kombilo's search speed and -accuracy. Search speed partially bested due to multi-threadedness and accuracy bested due to not using a hashing scheme.
As usual, Anders Kierulf of SmartGo dropped whatever he was doing and tried implementing the same feature. He succeeded, but without the powerful analysis graphics and browsable example diagrams that MoyoGo has. The greatest flaw in SmartGo's free-pattern search however is its slowness. True to Microsoftian traditions (Kierulf used to work for them), he ressorts to FUD instead of going back to the drawing board: On his site he claims that Moyo Go's search feature sucks. He tried to cast in into a non-libellous, seemingly "objective" form, but it's clear whom he is targetting because there is only one commercial competitor to SmartGo that offers this feature, and it's made by yours truly.
I could simply feel pity for Anders that he is obviously unable to code something that can compete with MoyoGo's search functionality, but I don't. I don't because spreading fear, uncertainty and doubt about a competitor simply is a rather lowlife approach to being a software developer. I found a simple countermeasure to his baloney: I copied his FUD story, changed a few words and put it on my site, targeting him. Only, I was not afraid to name SmartGo by name, because the perfect defence in a libel suit is the truth :)
Take SmartGo. Moyo Go Studio, rising star, leading as it is in Go software R&D and the most innovative Go software on the market, implemented a "Kombilo" pattern search not so long ago. MoyoGo was the first, and managed to best Kombilo's search speed and -accuracy. Search speed partially bested due to multi-threadedness and accuracy bested due to not using a hashing scheme.
As usual, Anders Kierulf of SmartGo dropped whatever he was doing and tried implementing the same feature. He succeeded, but without the powerful analysis graphics and browsable example diagrams that MoyoGo has. The greatest flaw in SmartGo's free-pattern search however is its slowness. True to Microsoftian traditions (Kierulf used to work for them), he ressorts to FUD instead of going back to the drawing board: On his site he claims that Moyo Go's search feature sucks. He tried to cast in into a non-libellous, seemingly "objective" form, but it's clear whom he is targetting because there is only one commercial competitor to SmartGo that offers this feature, and it's made by yours truly.
I could simply feel pity for Anders that he is obviously unable to code something that can compete with MoyoGo's search functionality, but I don't. I don't because spreading fear, uncertainty and doubt about a competitor simply is a rather lowlife approach to being a software developer. I found a simple countermeasure to his baloney: I copied his FUD story, changed a few words and put it on my site, targeting him. Only, I was not afraid to name SmartGo by name, because the perfect defence in a libel suit is the truth :)
Mafia Threats: MS/VeriSign "Protection" Racket
VeriSign is the domain registrar for .com & .net (and have tried to (illegally?) squeeze as much money out of that as they could), but they are also the main culprits in a sinister ploy: That of the "Designed for Windows" program.
It's the classical mafia "protection" racket: We will protect you from us if you pay us money. If you do not pay us money, we will cause you serious trouble.*
And yes, Moyo Go Studio is feeling the heat from this scheme too. Microsoft and VeriSign have a deal: Developers must purchase an extortionately costly VeriSign number, and put this in their programs. If they don't, Microsoft will tell their users that - SECURITY ALERT! - your software is not "designed for Windows" and on top of that untrustworthy. And ask whether the user is really "willing to take the risk of data loss, identity theft and having a zombie child porn server on their system: [Yes (install scary software] [No (tell μISV to buy VeriSign number])". Like, if a VeriSign number somehow guarantees secure software. VeriSign is involved in a few other questionable practices as well.
One wonders where they get the guts - first they introduce a plethora of buffer overflow exploits, and when they've generated enough anger and indignation, they blame other people's software and demand cash from them. If not, you'll be deemed a "threat", and your users will be told not to trust your "non-compliant" software. Noncompliant as in not having paid your VeriSign bill. I do not know how much money VeriSign funnels to Microsoft.
The future looks bleak: Rumors are that Windows Vista makes it much harder to get away with not paying the hefty VeriSign fee. Windows Vista is reported to suck bigtime (and might contibute greatly to Microsoft's eventual, seemingly inevitable demise), so I recommend not bothering with it at all.
*A similar racket is operated by "professional organizations" for plumbers, electric engineering shops etc. For 1500 USD a year, a plumber buys the right to display a door sticker with "member of the guild of professional plumbers". Part of the proceeds go towards making consumers aware that without the sticker, the plumber is "untrustworthy". The rest of the loot is invested in a 170-foot yacht for the guy who thought of it first.
It's the classical mafia "protection" racket: We will protect you from us if you pay us money. If you do not pay us money, we will cause you serious trouble.*
And yes, Moyo Go Studio is feeling the heat from this scheme too. Microsoft and VeriSign have a deal: Developers must purchase an extortionately costly VeriSign number, and put this in their programs. If they don't, Microsoft will tell their users that - SECURITY ALERT! - your software is not "designed for Windows" and on top of that untrustworthy. And ask whether the user is really "willing to take the risk of data loss, identity theft and having a zombie child porn server on their system: [Yes (install scary software] [No (tell μISV to buy VeriSign number])". Like, if a VeriSign number somehow guarantees secure software. VeriSign is involved in a few other questionable practices as well.
One wonders where they get the guts - first they introduce a plethora of buffer overflow exploits, and when they've generated enough anger and indignation, they blame other people's software and demand cash from them. If not, you'll be deemed a "threat", and your users will be told not to trust your "non-compliant" software. Noncompliant as in not having paid your VeriSign bill. I do not know how much money VeriSign funnels to Microsoft.
The future looks bleak: Rumors are that Windows Vista makes it much harder to get away with not paying the hefty VeriSign fee. Windows Vista is reported to suck bigtime (and might contibute greatly to Microsoft's eventual, seemingly inevitable demise), so I recommend not bothering with it at all.
*A similar racket is operated by "professional organizations" for plumbers, electric engineering shops etc. For 1500 USD a year, a plumber buys the right to display a door sticker with "member of the guild of professional plumbers". Part of the proceeds go towards making consumers aware that without the sticker, the plumber is "untrustworthy". The rest of the loot is invested in a 170-foot yacht for the guy who thought of it first.
Wednesday, June 07, 2006
The First Law of Programmer Creativity
"The cost of software maintenance increases with the square of the programmer's creativity." First Law of Programmer Creativity, Robert D. Bliss, 1992.
This guy is spot-on. I am inundated with emails like: "It's totally cool to be able to do a one-click search on Sensei's Library - nobody has that - but could you make their entire site searchable off-line?"
and:
"It's totally cool that Moyo Go instantly shows important patterns - nobody has that - but can you add this horrendously complex feature so that it would be even cooler?"
and:
"It's totally cool that Moyo Go lets me record Audio comments - nobody has that - but can you make the program automatically jump around too?"
Etc. etc. I became hostage to my own creativity. I have to be firm and let Moyo Go gain notoriety first, while I work on the death blow for all dangerous competitors: MoyoGo's upcoming tactical expert. It will know about life & death, connectability and even Territory & Moyo because that's basically coded already - it just needs to know L&D and connectivity.
A TsumeGo module enhances the pattern expert as well - it won't suggest the occasional obviously bad move any more (due to out-of-pattern tactical issues).
This guy is spot-on. I am inundated with emails like: "It's totally cool to be able to do a one-click search on Sensei's Library - nobody has that - but could you make their entire site searchable off-line?"
and:
"It's totally cool that Moyo Go instantly shows important patterns - nobody has that - but can you
and:
"It's totally cool that Moyo Go lets me record Audio comments - nobody has that - but can you make the program automatically jump around too?"
Etc. etc. I became hostage to my own creativity. I have to be firm and let Moyo Go gain notoriety first, while I work on the death blow for all dangerous competitors: MoyoGo's upcoming tactical expert. It will know about life & death, connectability and even Territory & Moyo because that's basically coded already - it just needs to know L&D and connectivity.
A TsumeGo module enhances the pattern expert as well - it won't suggest the occasional obviously bad move any more (due to out-of-pattern tactical issues).
Hacker's Delight?
It arrived! Cool book, in spite of the fact that it wastes a lot of pages on proofs. I trust the author, I don't need no stinkin' proofs. I would have preferred twice as many bit twiddling tricks! And shouldn't the title be "Hackers' Delight"?
I need the quickest way to count trailing zero's, that sort of thing. As I said before, the NSA covertly prevented the ability to do any such important things (popcount as well) quickly on desktop machines. This book gives ideas and code how to do it a hundred times slower than a dedicated instruction would have done it*, but it's still very useful. *Don't forget that a mispredicted branch incurs a penalty of at least 10 cycles.
Due to MSB issues, the usability of some code is questionable to Pascal users, but for the tactical engine, I use Pascal only to generate MASM assembly anyway, and the ultimate goal is to code directly in 64-bit assembly.
It is insane not to write it in anything else but Assembly. People argue that the language used is less important than the efficiency of pruning heuristics and eval function, but they are assuming that I will write highly sub-optimal code. Even if I would do that, using assembly would outweigh that.
So me being an Assembly hacker might very well offset my currently limited Go knowledge as it will be applied to various heuristics, because the difference between compiler-optimized optimal 32-bit C code and human-optimized 64-bit assembly code is enormous - sometimes a factor 10 or more. Because using 64-bit registers allows for a radically different, highly efficient approach of some aspects, and coding at the metal is incomparably better than being at the mercy of the optimizer.
I need the quickest way to count trailing zero's, that sort of thing. As I said before, the NSA covertly prevented the ability to do any such important things (popcount as well) quickly on desktop machines. This book gives ideas and code how to do it a hundred times slower than a dedicated instruction would have done it*, but it's still very useful. *Don't forget that a mispredicted branch incurs a penalty of at least 10 cycles.
Due to MSB issues, the usability of some code is questionable to Pascal users, but for the tactical engine, I use Pascal only to generate MASM assembly anyway, and the ultimate goal is to code directly in 64-bit assembly.
It is insane not to write it in anything else but Assembly. People argue that the language used is less important than the efficiency of pruning heuristics and eval function, but they are assuming that I will write highly sub-optimal code. Even if I would do that, using assembly would outweigh that.
So me being an Assembly hacker might very well offset my currently limited Go knowledge as it will be applied to various heuristics, because the difference between compiler-optimized optimal 32-bit C code and human-optimized 64-bit assembly code is enormous - sometimes a factor 10 or more. Because using 64-bit registers allows for a radically different, highly efficient approach of some aspects, and coding at the metal is incomparably better than being at the mercy of the optimizer.
Cool Forum
I found a forum with people talking about Life, the Universe and The Rest, with heaps of chess programming techniques as a bonus. Since the computer Go mailing list is all about boring non-programming stuff these days, the Chess forum is all there is - but it's very useful anyway:
http://216.25.93.108/forum/
http://216.25.93.108/forum/
Tuesday, June 06, 2006
Short Story
I managed to salvage an old version of a SF story I wrote.
God?
The Search for Extraterrestrial Intelligence finally seemed to bear fruit: A consistent signal was received that came from an entire quadrant of the galaxy, superimposed on the background radiation in that area. This was remarkable, because the source of the signal was not a stellar object or even a location in the universe, but it was like it was uniformly distributed over an area of about one sixteenth of the outer sphere of the universe. Hard to miss, therefore. The signal hadn't been there before.
It didn't take long before they cracked the code, to be more precise, the NSA did it. The CIA took it from there. They put a couple of bright kids onto the puzzle. The code was simply a sequence of zero's and ones. They discovered that the number of binary digits was the square of a prime number. When the digits were arranged in a square so that a side was equal to that of that prime number, the resulting box of 0's and 1's formed an image. Kind of like a fax.
There were quite a number of those images, and they were repeated all the time. It took about a month for all the data to be received.
Together, they formed the schematics of a rather simple looking electronic device. The images were annotated with a kind of symbolic mathematical script.
It didn't take long before also that was deciphered as well. Some symbols pertained to atom numbers, indicating the composition of the materials. Some symbols were measurements expressed in wavelengths of Hydrogen.
To make a long story short: The US government, even the President himself, were consulted. The CIA wanted to build the device. Experts said that it could do no harm, it seemed to be some kind of simple receiver that apparently didn't even need a power source. Although it was unclear what
exactly it would receive.
They built it and as soon as they turned it on, it outputted a similar stream of binary digits, only much more than had been originally received by SETI, and in much higher resolution and with a much higher frequency. It was like they had hooked up a Hard Disk, transferring many megabytes of data in an endless loop.
Everybody thought it was a hoax. Investigations were launched. They got people from outside the team to find malversations. They put the receiver in a cage of Faraday, and it kept on receiving. They let another team assemble a similar device, which worked just as well and produced the same output, synchronously to the last bit, with he first device. They took it into an abandoned coal mine, and it kept working. Components were replaced with equivalent components and it kept working.
People were getting nervous. The output of this device looked like the blueprints of some kind of transceiver. It would take a while to build it. Some of the components were hard, possibly impossible to manufacture. One of the components was a diamond ball of around 2 cm in diameter. They would have needed to steal the Koh-I-Noor and polish it into a rather dull marble, in order to comply with those blueprints.
And that is exactly what they did. Synthetic diamonds of that size were impossible to manufacture. The team who did it, was immediately afterwards assasinated by a "special" team. And the assassins themselves disappeared shortly afterwards. The diamond was polished and the result was referred to as: "The marble". The engineers thought it was made of glass. For increased security, ten identical devices were assembled, nine of which used an ordinary quarz marble.
When the thing was finished, a small team of NSA mathematicians and linguists was assembled. All were in their mid-fourties. None of them was married. They were given the neccessary security clearance. They were informed about the need for total secrecy and the consequences of breaking secrecy. Curious as they were, all of them agreed. They would work in duo's, one mathematician, one linguist, taking turns on the thing that was presumed to be a communication device.
The machine, modest in size, was hooked up to the serial port of a computer. It only had one input or output (whatever it was), a copper rod, protruding from the base. The device itself was simple in construction, but the dimensions were extremely accurate and the materials were rather exotic. The spherical diamond was partly embedded into two blocks of Rubidium and Osmium alloy, joint at polished sides in the middle. The thing was like a crystal-earphone, only more solid and a tad more expensive.
The base was connected to the ground of the PC. It didn't say that in the instructions, but it seemed a good idea.
Only two mathematicians were allowed to do the initial testing. A NSA programmer had made a simple program with which mathematical symbols could be converted into zeros and ones, and put into the device. And whenever similar pulses would be received, those pulses would appear as their corresponding mathematical symbols on the screen. The "LaTeX" symbols the scientist were familiar with were translated in the deciphered symbols used in the original blueprints.
The most senior of the two was called McFarlow.
McFarlow started to type. Normally a touch-typist, he now used one finger at a time, slowly, hitting each key squarely on the head.
"This is my version of 'Who are you'", McFarlow whispered hoarsely, as he entered the symbols for:
"Outside This Collection Equals"
There was an almost unperceptible delay as the following appeared:
"Collection Sum Collections"
McFarlow seemed to hold his breath. In fact he inhaled very slowly through his nose, trying to remain calm.
He glanced around the periphery of the computer screen, as if to see if he wasn't tricked in some way.
This wasn't happening. If this was a joke, he wasn't sure he had the sense of humor to appreciate it.
He had just asked "Who are you", in mathematical equivalent symbolic notation. Or at least, that was what he intended to ask.
The answer could only be interpreted as: "I am somewhere that contains where you are" or, more dramatically "I include/encompass everything".
"This is impossible" he said.
"Whoever we are talking to, a signal can't go faster than light. The furthest anyone can be who's answering this, is the Moon."
"What about faster-than light, like gravitational waves?" Wayne Drury suggested.
"We don't know how this thing works".
McFarlow unfocused his eyes for a moment in thought and then typed:
"Outside This Collection Position Equals"
"What does that mean?".
"I try to ask where this comes from, where he, she or it is located".
The answer was immediate.
"Outside All Collections Largest Sum Collection".
Wayne said nothing.
McFarlow looked intently at the screen and didn't blink an eye.
"Time Min Max Equals" he entered.
Again, the answer appeared barely after his finger released the ENTER key.
"ONE True Inside Collection Zero Towards Positive Infinity TWO True Outside All Collections Negative Infinity Zero Positive Infinity"
Wayne looked at him and before he could ask, McFarlowe said with a hoarse voice: "Shhst! Let me look at this".
He frowned as he made a smudge trail with his finger as it moved, word for word, over the text on the screen.
After a while he said with a sigh: "I asked about when it came into existence, when it would cease to exist". I think it means this:
"1. For us here, it has existed since the beginning of time and will exist until the end of time."
"Funny how it doesn't use "Infinity minus one" but simply 'infinity'.."
"2. For itself, it claims it has always existed and will always exist."
Wayne, incredulously but highly exited exclaimed: "So, what you are saying is.. We're talking to God!"
"It appears so". McFarlow looked very serious. Solemn, almost.
"Jezus Christ.." Wayne said.
"I don't believe it until he proves it" McFarlow said.
"I just don't believe it".
"Ask it a question" Wayne suggested.
"Like what?"
"Well, is it almighty, how's it like in heaven, that kind of stuff.." Wayne replied. Wayne was a religious person.
"I'm not sure how to translate that into mathematical notation". McFarlow obviously was losing his patience but seemed intrigued at the same time.
He started to type.
"Unknown True: Element A Inside This Collection Towards Not Element A In This Collection Equals Element A Inside Collection Sum Collections".
Wayne, the linguist, wiggled his eyebrows. "I'm afraid you've lost me here".
McFarlow hadn't hit the ENTER key yet.
"We have to hurry", he said. "We don't know how long the connection will last".
"I am asking whether there is a heaven, more exactly, something after we cease to exist, after our universe ceases to exist, is there anywhere we will still be". He pressed ENTER.
Again, the answer was instantaneous.
"Not True".
"At least it seems to understand what I mean", said McFarlow.
Wayne was visibly disappointed . He tried to hide that fact that he was taking offence.
"Ask whether it is almighty".
"Do you have any idea on how to formulate that?"
"Something with the collections?" Wayne suggested. "Like, can he make our world, our 'collection', disappear?".
McFarlow didn't need much encouragement. He typed:
"Unknown True: Collection Sum Collections Statistical Probability
GreaterThan Minus Infinity Towards Not This Collection".
ENTER
"True".
"The answer is yes", McFarlow said.
"Oh".
"Are you sure?".
"I'm not even sure this isn't some sick joke".
"Ask it whether it knows everything, whether it is all-knowing".
"Unknown True: Collection Sum Collections Includes Equation Solved True Sum Actions This Collection"
"Not True"
"Hm", said Wayne.
"So he claims to be God and to be able to be all-powerful or at least able to destroy us, but there is no heaven and he does not know what is happening here?"
"Nobody was ever sure about the true nature of God" said McFarlow.
"It's obviously better to ask him directly".
"A pity he doesn't speak normal English" Wayne joked.
"Can't we send him some pictures, ask him what he thinks about starving African AIDS orphans, that kind of stuff?"
McFarlow looked at him.
"That's not such a bad idea at all. We could send it in the same format as the blueprints. Digitized pictures, black & white, as binary digit in squares with the same side length as the blueprints we got".
"OK then, let's do it", Wane said.
They Googled some pictures of starving children, victims of the Hiroshima bombing and a particularly gruesome image from the Holocaust. They scaled the images to the right size and converted them to monochrome.
From the network, they prepared to send the images to who- or whatever was on the other side.
"When were are really talking to God, I would like to see what he has to say about THIS" McFarlow mumbled. It would only take a second or so to send them all and they accompanied the images with the following question:
"Unknown: Sample Here Collection Towards Collection Sum Collections Unknown True False".
Both mathematicians realized there was no way of expressing an ethical judgement, of expressing "good" or "bad" in mathematical language. They needed a more subtle means of communication. But whoever was on the other side seemed to be able to make sense of their symbols, at least for now.
They proceeded with sending the images, which took only a second or so.
But there came no answer.
They tried a few more questions but there was no response. They thought they had perhaps overloaded the tranmitter.
Later that day, the engineers put the diamond marble into one of the nine spare tranceivers and McFarlow and Drury were allowed to try again. No result.
They handed over the records of communication, and they were reproached for having sent the images.
Another team took over, with NSA minders this time.
The line was dead.
McFarlow suggested offering an apology, which he creatively formulated in LaTeX:
"True True True: This Collection Element Equals Minus Infinity This Collection Element Equals Not True Not True Not True"
A tear ran over his cheek as he typed it. The only chance for a dying Earth to be elevated to a higher level - technologically as well as spiritually - forever spoilt by a reckless demonstration of the worst of humanity?
An answer never came.
God?
The Search for Extraterrestrial Intelligence finally seemed to bear fruit: A consistent signal was received that came from an entire quadrant of the galaxy, superimposed on the background radiation in that area. This was remarkable, because the source of the signal was not a stellar object or even a location in the universe, but it was like it was uniformly distributed over an area of about one sixteenth of the outer sphere of the universe. Hard to miss, therefore. The signal hadn't been there before.
It didn't take long before they cracked the code, to be more precise, the NSA did it. The CIA took it from there. They put a couple of bright kids onto the puzzle. The code was simply a sequence of zero's and ones. They discovered that the number of binary digits was the square of a prime number. When the digits were arranged in a square so that a side was equal to that of that prime number, the resulting box of 0's and 1's formed an image. Kind of like a fax.
There were quite a number of those images, and they were repeated all the time. It took about a month for all the data to be received.
Together, they formed the schematics of a rather simple looking electronic device. The images were annotated with a kind of symbolic mathematical script.
It didn't take long before also that was deciphered as well. Some symbols pertained to atom numbers, indicating the composition of the materials. Some symbols were measurements expressed in wavelengths of Hydrogen.
To make a long story short: The US government, even the President himself, were consulted. The CIA wanted to build the device. Experts said that it could do no harm, it seemed to be some kind of simple receiver that apparently didn't even need a power source. Although it was unclear what
exactly it would receive.
They built it and as soon as they turned it on, it outputted a similar stream of binary digits, only much more than had been originally received by SETI, and in much higher resolution and with a much higher frequency. It was like they had hooked up a Hard Disk, transferring many megabytes of data in an endless loop.
Everybody thought it was a hoax. Investigations were launched. They got people from outside the team to find malversations. They put the receiver in a cage of Faraday, and it kept on receiving. They let another team assemble a similar device, which worked just as well and produced the same output, synchronously to the last bit, with he first device. They took it into an abandoned coal mine, and it kept working. Components were replaced with equivalent components and it kept working.
People were getting nervous. The output of this device looked like the blueprints of some kind of transceiver. It would take a while to build it. Some of the components were hard, possibly impossible to manufacture. One of the components was a diamond ball of around 2 cm in diameter. They would have needed to steal the Koh-I-Noor and polish it into a rather dull marble, in order to comply with those blueprints.
And that is exactly what they did. Synthetic diamonds of that size were impossible to manufacture. The team who did it, was immediately afterwards assasinated by a "special" team. And the assassins themselves disappeared shortly afterwards. The diamond was polished and the result was referred to as: "The marble". The engineers thought it was made of glass. For increased security, ten identical devices were assembled, nine of which used an ordinary quarz marble.
When the thing was finished, a small team of NSA mathematicians and linguists was assembled. All were in their mid-fourties. None of them was married. They were given the neccessary security clearance. They were informed about the need for total secrecy and the consequences of breaking secrecy. Curious as they were, all of them agreed. They would work in duo's, one mathematician, one linguist, taking turns on the thing that was presumed to be a communication device.
The machine, modest in size, was hooked up to the serial port of a computer. It only had one input or output (whatever it was), a copper rod, protruding from the base. The device itself was simple in construction, but the dimensions were extremely accurate and the materials were rather exotic. The spherical diamond was partly embedded into two blocks of Rubidium and Osmium alloy, joint at polished sides in the middle. The thing was like a crystal-earphone, only more solid and a tad more expensive.
The base was connected to the ground of the PC. It didn't say that in the instructions, but it seemed a good idea.
Only two mathematicians were allowed to do the initial testing. A NSA programmer had made a simple program with which mathematical symbols could be converted into zeros and ones, and put into the device. And whenever similar pulses would be received, those pulses would appear as their corresponding mathematical symbols on the screen. The "LaTeX" symbols the scientist were familiar with were translated in the deciphered symbols used in the original blueprints.
The most senior of the two was called McFarlow.
McFarlow started to type. Normally a touch-typist, he now used one finger at a time, slowly, hitting each key squarely on the head.
"This is my version of 'Who are you'", McFarlow whispered hoarsely, as he entered the symbols for:
"Outside This Collection Equals"
There was an almost unperceptible delay as the following appeared:
"Collection Sum Collections"
McFarlow seemed to hold his breath. In fact he inhaled very slowly through his nose, trying to remain calm.
He glanced around the periphery of the computer screen, as if to see if he wasn't tricked in some way.
This wasn't happening. If this was a joke, he wasn't sure he had the sense of humor to appreciate it.
He had just asked "Who are you", in mathematical equivalent symbolic notation. Or at least, that was what he intended to ask.
The answer could only be interpreted as: "I am somewhere that contains where you are" or, more dramatically "I include/encompass everything".
"This is impossible" he said.
"Whoever we are talking to, a signal can't go faster than light. The furthest anyone can be who's answering this, is the Moon."
"What about faster-than light, like gravitational waves?" Wayne Drury suggested.
"We don't know how this thing works".
McFarlow unfocused his eyes for a moment in thought and then typed:
"Outside This Collection Position Equals"
"What does that mean?".
"I try to ask where this comes from, where he, she or it is located".
The answer was immediate.
"Outside All Collections Largest Sum Collection".
Wayne said nothing.
McFarlow looked intently at the screen and didn't blink an eye.
"Time Min Max Equals" he entered.
Again, the answer appeared barely after his finger released the ENTER key.
"ONE True Inside Collection Zero Towards Positive Infinity TWO True Outside All Collections Negative Infinity Zero Positive Infinity"
Wayne looked at him and before he could ask, McFarlowe said with a hoarse voice: "Shhst! Let me look at this".
He frowned as he made a smudge trail with his finger as it moved, word for word, over the text on the screen.
After a while he said with a sigh: "I asked about when it came into existence, when it would cease to exist". I think it means this:
"1. For us here, it has existed since the beginning of time and will exist until the end of time."
"Funny how it doesn't use "Infinity minus one" but simply 'infinity'.."
"2. For itself, it claims it has always existed and will always exist."
Wayne, incredulously but highly exited exclaimed: "So, what you are saying is.. We're talking to God!"
"It appears so". McFarlow looked very serious. Solemn, almost.
"Jezus Christ.." Wayne said.
"I don't believe it until he proves it" McFarlow said.
"I just don't believe it".
"Ask it a question" Wayne suggested.
"Like what?"
"Well, is it almighty, how's it like in heaven, that kind of stuff.." Wayne replied. Wayne was a religious person.
"I'm not sure how to translate that into mathematical notation". McFarlow obviously was losing his patience but seemed intrigued at the same time.
He started to type.
"Unknown True: Element A Inside This Collection Towards Not Element A In This Collection Equals Element A Inside Collection Sum Collections".
Wayne, the linguist, wiggled his eyebrows. "I'm afraid you've lost me here".
McFarlow hadn't hit the ENTER key yet.
"We have to hurry", he said. "We don't know how long the connection will last".
"I am asking whether there is a heaven, more exactly, something after we cease to exist, after our universe ceases to exist, is there anywhere we will still be". He pressed ENTER.
Again, the answer was instantaneous.
"Not True".
"At least it seems to understand what I mean", said McFarlow.
Wayne was visibly disappointed . He tried to hide that fact that he was taking offence.
"Ask whether it is almighty".
"Do you have any idea on how to formulate that?"
"Something with the collections?" Wayne suggested. "Like, can he make our world, our 'collection', disappear?".
McFarlow didn't need much encouragement. He typed:
"Unknown True: Collection Sum Collections Statistical Probability
GreaterThan Minus Infinity Towards Not This Collection".
ENTER
"True".
"The answer is yes", McFarlow said.
"Oh".
"Are you sure?".
"I'm not even sure this isn't some sick joke".
"Ask it whether it knows everything, whether it is all-knowing".
"Unknown True: Collection Sum Collections Includes Equation Solved True Sum Actions This Collection"
"Not True"
"Hm", said Wayne.
"So he claims to be God and to be able to be all-powerful or at least able to destroy us, but there is no heaven and he does not know what is happening here?"
"Nobody was ever sure about the true nature of God" said McFarlow.
"It's obviously better to ask him directly".
"A pity he doesn't speak normal English" Wayne joked.
"Can't we send him some pictures, ask him what he thinks about starving African AIDS orphans, that kind of stuff?"
McFarlow looked at him.
"That's not such a bad idea at all. We could send it in the same format as the blueprints. Digitized pictures, black & white, as binary digit in squares with the same side length as the blueprints we got".
"OK then, let's do it", Wane said.
They Googled some pictures of starving children, victims of the Hiroshima bombing and a particularly gruesome image from the Holocaust. They scaled the images to the right size and converted them to monochrome.
From the network, they prepared to send the images to who- or whatever was on the other side.
"When were are really talking to God, I would like to see what he has to say about THIS" McFarlow mumbled. It would only take a second or so to send them all and they accompanied the images with the following question:
"Unknown: Sample Here Collection Towards Collection Sum Collections Unknown True False".
Both mathematicians realized there was no way of expressing an ethical judgement, of expressing "good" or "bad" in mathematical language. They needed a more subtle means of communication. But whoever was on the other side seemed to be able to make sense of their symbols, at least for now.
They proceeded with sending the images, which took only a second or so.
But there came no answer.
They tried a few more questions but there was no response. They thought they had perhaps overloaded the tranmitter.
Later that day, the engineers put the diamond marble into one of the nine spare tranceivers and McFarlow and Drury were allowed to try again. No result.
They handed over the records of communication, and they were reproached for having sent the images.
Another team took over, with NSA minders this time.
The line was dead.
McFarlow suggested offering an apology, which he creatively formulated in LaTeX:
"True True True: This Collection Element Equals Minus Infinity This Collection Element Equals Not True Not True Not True"
A tear ran over his cheek as he typed it. The only chance for a dying Earth to be elevated to a higher level - technologically as well as spiritually - forever spoilt by a reckless demonstration of the worst of humanity?
An answer never came.
Reached Clockcycle Level
The tactics code is almost finished (1000+ LOC but a lot of that is due to LF's and unrolled loops), and the more I work on it, the faster it becomes.
I have a kind of ADHD, and often, the only way I can design an algorithm is to get away from the keyboard. Depending on the finances and the weather, it's either the pub or the tub :-) Today I walked to the harbor and designed a simple stack-based floodfill algo over a beer. The challenge was splitting empty chains most efficiently by re-using some of the cleverly-represented data, so that I do not have to floodfill both parts of a dual split.
Then I went home and hacked it up.
I keep discovering ways to get that code faster. Letting small arrays start on a cache line is an obvious speedup, but at the moment I am way beyond that and am using AMD's instruction latency table to decide whether a SHR with an AND could be replaced by a table lookup to shave off one cycle, that sort of thing. A pity I have to hit the sack, because I just found yet another dramatic speedup trick that I'm eager to implement because it also makes the source smaller and easier to read (the source looks, um, extraterrestrial).
Of course, the resulting 64-bit MASM assembly that Free Pascal generates will have to be hand-optimized by yours truly, but that will be low-pri. First the thing has to actually work, and it needs stuff around it to automate it as well. No need for undo code: I use a state stack that, due to minimalistic data representation, fits a deep treesearch entirely in the L1 cache.
Someone suggested that my business goal should be perfecting the Help file, but after years of work on Moyo Go, I need to re-invent myself, not to mention the need to do the hardest things when one's brain is youngest. There is an optimum between experience and number of deceased neurons, and it shifts leftwards with algorithmic complexity :-)
If Moyo Go wants to compete with the Big Boys and, as is the goal, ultimately dethrone them, I really have to bootstrap myself out of the swamp of bugfixing ever less serious bugs and perfecting things that are already pretty good. Nothing kills a business faster than lack of innovation. Moyo Go was never intended to be the best Go databasing/pattern system, it got that status because the Joseki library module turned out to be a super-powerful pattern expert system, so successful that even Microsoft tried to copy it. But my goal is to make a dan level Go program, and, heaven permitting, I'll succeed if I can keep away from that Help file :-)
I have a kind of ADHD, and often, the only way I can design an algorithm is to get away from the keyboard. Depending on the finances and the weather, it's either the pub or the tub :-) Today I walked to the harbor and designed a simple stack-based floodfill algo over a beer. The challenge was splitting empty chains most efficiently by re-using some of the cleverly-represented data, so that I do not have to floodfill both parts of a dual split.
Then I went home and hacked it up.
I keep discovering ways to get that code faster. Letting small arrays start on a cache line is an obvious speedup, but at the moment I am way beyond that and am using AMD's instruction latency table to decide whether a SHR with an AND could be replaced by a table lookup to shave off one cycle, that sort of thing. A pity I have to hit the sack, because I just found yet another dramatic speedup trick that I'm eager to implement because it also makes the source smaller and easier to read (the source looks, um, extraterrestrial).
Of course, the resulting 64-bit MASM assembly that Free Pascal generates will have to be hand-optimized by yours truly, but that will be low-pri. First the thing has to actually work, and it needs stuff around it to automate it as well. No need for undo code: I use a state stack that, due to minimalistic data representation, fits a deep treesearch entirely in the L1 cache.
Someone suggested that my business goal should be perfecting the Help file, but after years of work on Moyo Go, I need to re-invent myself, not to mention the need to do the hardest things when one's brain is youngest. There is an optimum between experience and number of deceased neurons, and it shifts leftwards with algorithmic complexity :-)
If Moyo Go wants to compete with the Big Boys and, as is the goal, ultimately dethrone them, I really have to bootstrap myself out of the swamp of bugfixing ever less serious bugs and perfecting things that are already pretty good. Nothing kills a business faster than lack of innovation. Moyo Go was never intended to be the best Go databasing/pattern system, it got that status because the Joseki library module turned out to be a super-powerful pattern expert system, so successful that even Microsoft tried to copy it. But my goal is to make a dan level Go program, and, heaven permitting, I'll succeed if I can keep away from that Help file :-)
Sunday, June 04, 2006
Hyperfast Empty Area Splitter
Everything about Moyo Go is "speed", and especially the TsumeGo module should be fast.
A useful thing to speed up L&D analysis is the concept of "empty chains", but it can be a time-consuming process to detect whether a move split a chain into up to four new ones.
The computer Go mailing list archive had a few interesting and speedy solutions to this problem, but I kept having some nagging objections to even the fastest proposed solution. I don't know whether it was luck or wisdom, but I ran into a much faster solution. To determine whether a split occurs I have to look only at the data in one single cache line! And the number of operations required is so small that it seems close to the theoretical minimum.
The split itself typically takes very few clockcycles and happens in less than a handful of cache lines as well. It's achieved by a combination of minimalistic data representation, advanced boolean algebra and a thorough understanding of the CPU.
A useful thing to speed up L&D analysis is the concept of "empty chains", but it can be a time-consuming process to detect whether a move split a chain into up to four new ones.
The computer Go mailing list archive had a few interesting and speedy solutions to this problem, but I kept having some nagging objections to even the fastest proposed solution. I don't know whether it was luck or wisdom, but I ran into a much faster solution. To determine whether a split occurs I have to look only at the data in one single cache line! And the number of operations required is so small that it seems close to the theoretical minimum.
The split itself typically takes very few clockcycles and happens in less than a handful of cache lines as well. It's achieved by a combination of minimalistic data representation, advanced boolean algebra and a thorough understanding of the CPU.
A Philosophy of Quality
This is the car I bought last week, a 1990 Saab 900i v16.
I posted this not to "show off" (the thing set me back a mere 1,500 $) but to make a point: Saab automobiles are pioneering marvels, built according to a philosophy of quality that fits that of my own. Saab originally made military fighter jets, and this is especially noticeable in older models.
These cars are fast, luxurious and very comfortable, just like Moyo Go Studio (-:
Those cars have a comfortable cruising speed of 100 mph, wonderful details like soap-spraying headlight wipers, heated chairs and electrical sunroof but most important for me is the humongous cargo hold when the back seats are lowered (my car is longer than the one in this picture and has those typical isolated tiny back windows, I bought the 5-door model but the color is exactly the same, only much darker :). In Stockholm there is a Saab 900 taxicab that has a million km on the clock and it still has the same engine!
This blog entry illustrates another important point: That it is possible for a single person with very limited means to produce high-end, state-of-the art software that can compete at the highest level. Take the Swedish Saab Gripen (photo). Sweden, a country of eight million people, builds jet fighters on a par with the Eurofighter and can even compete with the Joint Strike Fighter. The Moyo Go Studio / SmartGo competition is similar. MoyoGo is made by someone with almost zero funds, SmartGo is made by a multi-millionaire, but that is often a liability as well, and I think ultimately, it will be its demise because Anders simply buys whatever is hard to program. He has bought the source code of his TsumeGo module (Thomas Wolf's GoTools), and he has bought the sourcecode for his Joseki module from MasterGo. The result: An unmaintainable hodgepodge of "not invented here" stuff that will gather dust and rust.
In the long run, Moyo Go will win because it's made of higher quality, in-house parts.
I posted this not to "show off" (the thing set me back a mere 1,500 $) but to make a point: Saab automobiles are pioneering marvels, built according to a philosophy of quality that fits that of my own. Saab originally made military fighter jets, and this is especially noticeable in older models.
These cars are fast, luxurious and very comfortable, just like Moyo Go Studio (-:
Those cars have a comfortable cruising speed of 100 mph, wonderful details like soap-spraying headlight wipers, heated chairs and electrical sunroof but most important for me is the humongous cargo hold when the back seats are lowered (my car is longer than the one in this picture and has those typical isolated tiny back windows, I bought the 5-door model but the color is exactly the same, only much darker :). In Stockholm there is a Saab 900 taxicab that has a million km on the clock and it still has the same engine!
This blog entry illustrates another important point: That it is possible for a single person with very limited means to produce high-end, state-of-the art software that can compete at the highest level. Take the Swedish Saab Gripen (photo). Sweden, a country of eight million people, builds jet fighters on a par with the Eurofighter and can even compete with the Joint Strike Fighter. The Moyo Go Studio / SmartGo competition is similar. MoyoGo is made by someone with almost zero funds, SmartGo is made by a multi-millionaire, but that is often a liability as well, and I think ultimately, it will be its demise because Anders simply buys whatever is hard to program. He has bought the source code of his TsumeGo module (Thomas Wolf's GoTools), and he has bought the sourcecode for his Joseki module from MasterGo. The result: An unmaintainable hodgepodge of "not invented here" stuff that will gather dust and rust.
In the long run, Moyo Go will win because it's made of higher quality, in-house parts.
Friday, June 02, 2006
Freeware Editor Headaches
Yesterday night I spent a few feverish hours trying to make an update to the Freeware SGF Editor (to add translations, multiple gobans and drag & drop skins/files).
Result: I destroyed the sources for the freeware version. The problem is that maintaining the same sourcecode for two versions is complex. MoyoGo is an enormous application and I have to disable thousands of parts with conditional compilation compiler directives, but I also have to physically remove all GUI widgets that are not used or hide them, whatever is appropriate.
This process is labor-intensive, and as I now know, error-prone. I simply have no time to do it, the coming months. It's too much stress, I need to concentrate my efforts on the tactical module and a few other things that suck up time and that are much more important at the moment.
Result: I destroyed the sources for the freeware version. The problem is that maintaining the same sourcecode for two versions is complex. MoyoGo is an enormous application and I have to disable thousands of parts with conditional compilation compiler directives, but I also have to physically remove all GUI widgets that are not used or hide them, whatever is appropriate.
This process is labor-intensive, and as I now know, error-prone. I simply have no time to do it, the coming months. It's too much stress, I need to concentrate my efforts on the tactical module and a few other things that suck up time and that are much more important at the moment.