Assembler, Part 1
I don't yet have a name for the Assembler language program-thingy I'm doing so I'm just going to refer to it as Assembler (giev suggestions!) for the time being. Cool? Cool.
As mentioned in a previous post, I'm taking time out from my commercial software and am instead working on fun little projects. Clearly, our definition of the word fun differs greatly.
I wrote a Turing program a long time ago and recently it occurred to me that I could get a more advanced version by essentially emulating a CPU. The Turing machine is pretty much what that is. In this case, I'd be going with a Motorola 68000. Because it's lovely, and because it seems simple enough as compared to the nightmares that are the x86 and x64 architectures. Bear in mind that I don't really know much about assembler/assembly other than tinkerings back in the late 1990s.
Most of the time so far has been spent creating a lexer (syntax highlighter) for Scintilla. Being a .NET OOP guy and working with Scintilla.NET (which doesn't attempt to really refactor anything), its API is a complete mess. So it took me quite a while to get highlighting happening, and it's still not done. Thankfully there's a project called ScintillaNET-Kitchen that helps out.
I haven't gone for a multi-document interface or anything that you'd find in more complete IDEs - how do I even know if this project will work satisfactorily? Hell, saying that, I've (partly!) made it possible to switch in new CPUs. I think that itself is a bit much and so don't currently plan on fully adding support for such a feature. Hmm... I could also create a chipset with multiple chips...? No - NO. Stop right there.
Yeah, alright - what the eff is the point?
Fun. As said above.
I'm going to keep adding to it. Forever. CPUs don't tend to do a whole lot on their own; their instruction sets aren't very big, either. Lots of shunting of data all over the place. Of course this is simplified and from someone that doesn't really assembler-me-do, but how hard can it be?
So, if I can get the basics working then I can start adding built-in routines that a program can call to do more advanced stuff. A video display would need to be added, of some sort. Only being able to manipulate registers isn't the most exciting thing after the initial novelty wears off.
Shut up and tell me if it's done yet
Hell no it isn't done. It doesn't even parse the entered code yet. There's the editor with all its syntax highlighting tomfoolery, and there's the
There's a screenshot below. That's all there currently is. Will see if I have time to get the parsing - at least started - in today.
Edit: Did the opposite of what I said and have continued working on the interface. Oops.
Assembler, Part 2
Part number the two!
Did some more work on the Assembler project (part 1 here) that didn't just involve improving the interface. Ooh, ooh - what?
Parsing. That's what.
An example of which is:
Each line in the source is read one-by-one. Once a line is read, a regular expression
Note that this expression is far from complete. First change would involve removing the rigid white-space structure.
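As a rough illustration of the read-a-line, try-the-regex loop, here's a minimal sketch in Python rather than the project's .NET code. The pattern is a made-up stand-in, not the expression used in Assembler, and `LINE_RE` and `parse_source` are hypothetical names. It already tolerates flexible white-space, which is the first change mentioned above.

```python
import re

# Hypothetical pattern: optional "label:", a mnemonic, an optional .B/.W/.L
# size suffix, then the remaining text as operands, then an optional comment.
LINE_RE = re.compile(
    r"""^\s*
        (?:(?P<label>[A-Za-z_]\w*):)?      # optional label
        \s*(?P<mnemonic>[A-Za-z]+)         # e.g. MOVE, ADD
        (?:\.(?P<size>[BWLbwl]))?          # optional size suffix
        (?:\s+(?P<operands>[^;]*?))?       # operand text, up to any comment
        \s*(?:;.*)?$                       # optional trailing comment
    """,
    re.VERBOSE,
)


def parse_source(source):
    """Return one dict per matching line; lines that fail the regex are skipped."""
    parsed = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = LINE_RE.match(line)
        if match is None:
            continue  # failed the match: on to the next line
        size = match.group("size")
        parsed.append({
            "line": number,
            "label": match.group("label"),
            "mnemonic": match.group("mnemonic").upper(),
            "size": size.upper() if size else None,
            "operands": match.group("operands") or "",
        })
    return parsed
```

Blank lines and comment-only lines simply fail the match and drop out, so the caller only ever sees candidate instructions.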
The reason for the
Anyway, back to parsing.
If the line fails the regex match, then onto the next line. If it succeeds, the parsing continues and an
Undecided if mnemonics should inherit from a baseMnemonic and be individual classes, or if I should continue with the mnemonic simply being an enum on
Thinking further ahead, I'm also undecided if the entire assembler source should be parsed in one go or instead proceed one line at a time. The latter will allow code to be edited while the program is running, which is quite nice, and goes hand-in-hand with the Step debugger feature.
Right now, I can't think of any advantages of parsing in one go (parse each line and populate
Assembler, Part 3
I've gone ahead and added the BaseMnemonic class, along with an IMnemonic interface that all CPU mnemonics (instructions, op-codes - whatever you want to call them) inherit from.
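For anyone wondering what an interface-plus-base-class arrangement like that tends to look like, here's a minimal sketch in Python (not the project's .NET code). Only the names IMnemonic and BaseMnemonic come from the post; the members, the `Clr` and `Move` classes, and the `MNEMONICS` lookup are assumptions made up for illustration.

```python
from abc import ABC, abstractmethod


class IMnemonic(ABC):
    """Stand-in for the interface: what every instruction must offer."""

    @property
    @abstractmethod
    def name(self):
        ...

    @abstractmethod
    def execute(self, registers, operands):
        ...


class BaseMnemonic(IMnemonic):
    """Shared plumbing: the name defaults to the upper-cased class name."""

    @property
    def name(self):
        return type(self).__name__.upper()


class Clr(BaseMnemonic):
    """Hypothetical CLR: zero the named register."""

    def execute(self, registers, operands):
        registers[operands[0]] = 0


class Move(BaseMnemonic):
    """Hypothetical register-to-register MOVE (source first, then destination)."""

    def execute(self, registers, operands):
        source, destination = operands
        registers[destination] = registers[source]


# Look-up table keyed by mnemonic name, e.g. MNEMONICS["MOVE"].
MNEMONICS = {m.name: m for m in (Clr(), Move())}
```

The trade-off against a plain enum is more classes to write, but each instruction carries its own behaviour instead of living in one giant switch.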
The past hour or so has been spent improving the interface and adding things where necessary. The main window now has the standard set of root menu items (File, Edit, View, Debug, Tools, Help) and there's a (currently collapsed) project viewer.
Added a Messages window that will show To-do items, syntax errors, and the like. Also added Status Register to the Registers window. The tooltip for each register's value also displays the value of that register's contents in base 10 (Decimal) and likely base 16 (Hex) in the near future, too.
Next up is making use of the BaseMnemonic class by implementing the
Assembler, Part 4
What am I doing?
I've once again resumed working on the interface rather than the actual core; I still don't even know if any of this truly works. Sheesh.
Not a whole lot to report from yesterday as I practically spent the day reading while doing the odd bit of UI tinkering. Big thanks to Rob for reminding me about SyntaxBox! Scintilla is pretty horrible overall and SB is done so much better, so I'm glad the editor is now more solid. Code folding was extremely easy to implement, unlike Scintilla where I just couldn't be bothered and worked on another part instead.
Today I started a new control for displaying parsed instructions. Think of it as a ListBox but with no user interaction, and the "current item" is always vertically centred and highlighted; currently adding column support to tabulate the view. Oh, issue: I can't get the background transparent no matter what. I've done millions of transparent controls in the past, so this is confusing the hell out of me.
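The "current item is always centred" behaviour boils down to picking which slice of the list to draw. A minimal sketch of that calculation, in Python and with no actual UI involved; `visible_rows` is a hypothetical helper and it assumes an odd number of rows so there is a true middle slot:

```python
def visible_rows(items, current, rows):
    """Return `rows` entries with items[current] in the middle slot.

    Slots that fall before the start or past the end of the list are
    filled with None so the control can draw them blank.
    """
    half = rows // 2
    first = current - half
    window = []
    for index in range(first, first + rows):
        window.append(items[index] if 0 <= index < len(items) else None)
    return window
```

Padding with blanks rather than clamping the window is what keeps the highlighted row from sliding towards the top or bottom near the ends of the program.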
Next, I need to have a go at actually executing instructions. I've done very little bit-work in .NET so I'm not sure how that's going to go, especially as I don't even know the syntax for VB.Net; presumably it's going to be bat-shit insane compared to C#.
Assembler, Part 5
In the previous version, Assembler parsed down the entered code into
Now that property has been removed and all
All mnemonics for a CPU are defined within a
Some limited parsing is in for determining what the
Because of this, each
On the interface side of things (c'mon, like I can resist UI work) the syntax highlighting now colourises the datatype (or size as is the actual term) for mnemonics. Turned out this was achievable via making the datatypes an operator when it comes to highlighting as they don't need to be on word boundaries - thanks to Rob for the suggestion. Still more highlighting work required.
The Registers window now has a trace of what it parsed from the entered source code and displays the interpreted code. Changed the font to a mono-space one as the other just looked messy when registers' widths didn't line up. Sorry Segoe UI - not this time.
Assembler, Part 6
Okay, okay; can't put off doing the operand parsing any longer if I want this project to progress any further.
Just like the (incomplete) line-by-line parsing of the entered source, I'm going with Regular Expressions. Everything is going to be regular expressions; regular expressions all the way down.
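To give a flavour of regex-driven operand parsing, here's a minimal sketch in Python (not the project's .NET code). The operand kinds, patterns, and function names are all assumptions for illustration, and only a handful of 68000 operand forms are covered: Dn, An, (An), and immediates such as #10 or #$1F.

```python
import re

# Hypothetical operand kinds, tried in order; the first match wins.
OPERAND_PATTERNS = [
    ("data_register", re.compile(r"^D([0-7])$", re.IGNORECASE)),
    ("address_register", re.compile(r"^A([0-7])$", re.IGNORECASE)),
    ("indirect", re.compile(r"^\(A([0-7])\)$", re.IGNORECASE)),
    ("immediate", re.compile(r"^#(\$[0-9A-F]+|\d+)$", re.IGNORECASE)),
]


def classify_operand(text):
    """Return (kind, value); value is a register index or the immediate number."""
    text = text.strip()
    for kind, pattern in OPERAND_PATTERNS:
        match = pattern.match(text)
        if match is None:
            continue
        value = match.group(1)
        if kind == "immediate":
            # "$" marks a hexadecimal literal, otherwise it is decimal.
            return kind, int(value[1:], 16) if value.startswith("$") else int(value)
        return kind, int(value)
    return "unknown", text


def split_operands(operand_text):
    """Naive comma split; would need rework for modes that contain commas."""
    return [classify_operand(part) for part in operand_text.split(",") if part.strip()]
```

For example, `split_operands("#$1F, D3")` yields an immediate of 31 followed by data register 3.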
As there's no specific reason why the
There will no doubt be more as I add parsing for additional features when I get the existing ones working. Types will certainly become more granular, such as specifying whether an
The next big step is to have the
Bit-work. All operations can work with either a byte, word, or long-word. Long-word is easy, but the other two will require bit manipulation, which I haven't really done much of in .NET.
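The bit-work is mostly masking. A minimal sketch in Python (not .NET), assuming registers are held as plain 32-bit unsigned integers; `SIZE_MASKS`, `write_sized`, and `read_sized` are hypothetical names. It follows the 68000's data register behaviour, where a byte or word write only touches the low 8 or 16 bits and leaves the rest alone.

```python
# Masks for the three operand sizes: byte, word, long-word.
SIZE_MASKS = {"B": 0xFF, "W": 0xFFFF, "L": 0xFFFFFFFF}


def write_sized(register_value, new_value, size):
    """Replace only the low bits selected by `size`, keeping the upper bits."""
    mask = SIZE_MASKS[size]
    kept = register_value & ~mask & 0xFFFFFFFF  # bits the write must not touch
    return kept | (new_value & mask)


def read_sized(register_value, size, signed=False):
    """Read the low bits selected by `size`, optionally as two's complement."""
    mask = SIZE_MASKS[size]
    value = register_value & mask
    sign_bit = (mask >> 1) + 1  # 0x80, 0x8000 or 0x80000000
    if signed and value & sign_bit:
        value -= mask + 1
    return value
```

So writing the byte 0xAB into 0x12345678 gives 0x123456AB, and reading the low byte of 0x000000FF as signed gives -1.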
Memory. Sure, the registers are currently just an
Initial thinking would be just to allocate an array that's a property of the
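A flat array for memory might look something like the following minimal sketch, in Python rather than .NET. The `Memory` class and its methods are hypothetical. Values are stored big-endian, as on the 68000, and the sketch does not check alignment.

```python
class Memory:
    """Flat block of bytes with big-endian reads and writes of 1, 2 or 4 bytes."""

    def __init__(self, size):
        self._bytes = bytearray(size)

    def _check(self, address, width):
        if width not in (1, 2, 4):
            raise ValueError("width must be 1, 2 or 4 bytes")
        if address < 0 or address + width > len(self._bytes):
            raise IndexError("access outside allocated memory")

    def read(self, address, width):
        self._check(address, width)
        return int.from_bytes(self._bytes[address:address + width], "big")

    def write(self, address, width, value):
        self._check(address, width)
        mask = (1 << (8 * width)) - 1  # truncate the value to the access width
        self._bytes[address:address + width] = (value & mask).to_bytes(width, "big")
```

Writing the long-word 0x12345678 at address 0 means a byte read from address 0 returns 0x12, the most significant byte.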
Assembler, Part 7
Haven't quite gone ahead with actually executing instructions as yet because I re-worked the way a "machine" is implemented.
I ended up ripping out the hard-coded properties and features and made everything generic. There's now a set of
The potential issue here is that registers are now loosely typed. I'm not sure how to go about implementing the
When a new register is added to the
Anyway! If you wanted to add 8 Data registers, you'd do the following:
This will create 8 registers, all named D0 to D7. Basically, the index is applied directly after the base name.
Do CPUs only really have (general purpose) Data and Address registers? Could I replace the literal with an enum, instead?
Fetching a register? Like so:
With name being, say, "D6" to get the seventh Data register (numbering starts at D0).
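A minimal sketch of that add-by-base-name and fetch-by-name idea, in Python rather than the project's .NET code. `RegisterSet`, `RegisterKind`, `add_range`, `get`, and `set` are all made-up names, and using an enum for the base name is just one possible answer to the literal-versus-enum question above.

```python
from enum import Enum


class RegisterKind(Enum):
    """Hypothetical replacement for the string literal base name."""
    DATA = "D"
    ADDRESS = "A"


class RegisterSet:
    """Registers stored by name; the index is appended to the base name."""

    def __init__(self):
        self._values = {}

    def add_range(self, kind, count):
        for index in range(count):
            self._values[f"{kind.value}{index}"] = 0

    def names(self):
        return list(self._values)

    def get(self, name):
        return self._values[name.upper()]  # KeyError for unknown registers

    def set(self, name, value):
        key = name.upper()
        if key not in self._values:
            raise KeyError(name)
        self._values[key] = value & 0xFFFFFFFF  # keep to 32 bits
```

With this, `regs.add_range(RegisterKind.DATA, 8)` creates D0 to D7, and `regs.get("D6")` fetches the seventh of them.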
As I've (currently) implemented the
No changes have been made to mnemonics.
This currently just has a