The challenges of writing a Delphi/Object Pascal to C++ compiler.

I am seriously considering writing a Delphi-to-C++ compiler that is 99% compatible with Delphi XE6.

I’m finding FreePascal to be a bit frustrating, in different ways than the Delphi compiler is frustrating.  Embarcadero’s 64-bit compiler has recently profoundly let me down with its lack of optimization, while FPC is a maze to navigate and I keep getting random “Fatal: Compilation Aborted” messages that offer no indication of what the real issue is.

I like to complain a lot that Delphi’s generic support is limited and incomplete. Well… I’m finding that FreePascal’s generic support is even more limited and more incomplete, and the popular Lazarus IDE doesn’t really fully embrace the 2.7.1 compiler in its latest version (am I forced to run an older branch?). I have a lot of code that relies on the elegance that generics can bring to the table. I’d use generics more if either language had a robust implementation of generics. C#, for example, makes great use of generics. FPC supports generic classes, but not generic methods at this time. Delphi supports generic methods, but only when attached to a class; there are no generics allowed on global functions. Furthermore, there is no proper way to reference a generic record, because records do not support interfaces and interfaces are the way that constraints are expressed in Delphi generics. I’m tired of the lack of progress by Embarcadero and the community, really. I can do this… and it can be good.

Below is just a brainstorm of the various challenges that I would be up against in writing such a contraption.  If you’re adventurous and prolific and want to get the most out of your Pascal code… drop me a line and join my project…  I’m just getting started.  I have a 4.3 million line codebase to protect, so having a complete implementation is going to be quite important.

I think there are certain fossils I’d toss out, like the old Pascal calls and syntax that used to be part of the built-in library, such as the “AssignFile” and “WriteLn” commands. They are rooted in the archaic. My logic is that if you’re using those calls, you should probably be upgrading them to use TFileStream.

Eventually I’d like to add some features that I think are sorely missing from the language while paving the way for seriously hardcore optimization.

The Delphi/Object Pascal language is structured in such a way that it CAN and SHOULD BE as fast as C++. Since there are no compilers out there that turn Delphi/Object Pascal into code as fast as what the C++ compilers produce… I figure, why not just compile it to C++ anyway and let all the R&D that has been put into the open source compilers over the years dictate how fast the end code ultimately runs?

There are just a few considerations…

Consideration #1. The “Volatile” keyword.
C++ has the “volatile” keyword, which is missing from Pascal. The volatile keyword plays a key role in optimization. In C++ any variable access or call can be optimized, even executed out of order, or eliminated entirely unless it references volatile memory locations. This level of optimization makes a huge difference at run-time (although it makes optimized code practically impossible to debug). 99% of the time, just about any variable can be considered non-volatile, and in C++ that is the default for any variable. However, if you mark a variable as volatile, it tells the compiler “hey, even though I just read this from that memory location, there’s a chance that something else could come along and change the value after I read it, so it is important to always read the most recent, freshest result from that location.” In the embedded systems world, these volatile variables most often point directly to system hardware registers, flags, etc… but in a multi-threaded environment, it is also important to mark things that other threads might come along and modify… basically any cross-thread communication. For example, if I write code in one thread that checks for a flag set by another thread, the optimizer might never see the flag change in the other thread because it has optimized the read, put the value into a CPU register and forgotten about the original memory location.
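
As a rough illustration of the cross-thread flag case: in standard C++ (C++11 and later), volatile on its own does not make a flag shared between threads well-defined, and std::atomic is the portable tool for that job. So a translator would probably want to emit an atomic for any Pascal variable marked as cross-thread. The sketch below assumes that mapping; the names stop_requested, worker and run_demo are made up for the example.

```cpp
#include <atomic>
#include <thread>

// Hypothetical flag shared between threads. Because it is atomic, the
// optimizer may not cache it in a register and skip re-reading it.
std::atomic<bool> stop_requested{false};

// Polls the flag until it is set; returns how many times the loop body ran.
long worker() {
    long polls = 0;
    while (!stop_requested.load(std::memory_order_acquire)) {
        ++polls;
        std::this_thread::yield();
    }
    return polls;
}

// Starts the worker on another thread, raises the flag from the calling
// thread, and waits for the worker to notice it and finish.
bool run_demo() {
    long polls = -1;
    std::thread t([&polls] { polls = worker(); });
    stop_requested.store(true, std::memory_order_release);
    t.join();
    return polls >= 0;
}
```

For real hardware registers, volatile is still the right marker; the atomic is only for the thread-to-thread case.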

Consideration #2. The Best of both worlds.
C++ has a more mature class structure. It supports constructors/destructors in stack-allocated value types (the equivalent of records). It also has better operator overloading support, multiple inheritance, and probably a few other features that Delphi/Object Pascal lacks, but for the most part Object Pascal and C++ possess the same language features and are built on the same paradigms. I’d say the overlap is 98%. Delphi, on the other hand, has a few things that C++ lacks: unit initializers/finalizers, a native set of string types, simple but ridiculous integration with Windows messages (the “message” keyword), properties, and a generally more friendly and readable syntax throughout. I’d like to create a variation of Pascal that capitalizes on the strengths of both languages and is as fast as C++ can be underneath.
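
To make one of those Delphi-only features concrete, here is a minimal sketch of how a property might be lowered to plain C++: the read side becomes a getter call and the write side becomes a setter call. TShape, GetWidth, SetWidth and widen are hypothetical names, and this is only one possible mapping, not a settled design.

```cpp
#include <stdexcept>

// Hypothetical translation of a Delphi class that declares
//   property Width: Integer read FWidth write SetWidth;
class TShape {
public:
    int GetWidth() const { return FWidth; }   // generated for 'read FWidth'
    void SetWidth(int value) {                // the class's own 'write SetWidth'
        if (value < 0) throw std::invalid_argument("Width must be >= 0");
        FWidth = value;
    }
private:
    int FWidth = 0;
};

// Pascal:  Shape.Width := Shape.Width + Delta;
// The property read turns into GetWidth() and the assignment into SetWidth().
int widen(TShape& shape, int delta) {
    shape.SetWidth(shape.GetWidth() + delta);
    return shape.GetWidth();
}
```

Since the getter is a trivial inline function, a C++ optimizer should reduce the read to a direct field access.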

Consideration #3. Delphi compiles FAST.
Workflow is immensely important. If the program has to be compiled in two stages, then we run the risk of our app’s compilation speed being dictated by the speed of the C++ compiler underneath. This is unacceptable.  But my observation is that the only reason C++ takes as long as it does to compile is because of poor design decisions that have carried through since C was created. The inherent problem with C++ compilation time stems from the fact that the #include directive can cause files to be compiled, and recompiled, and recompiled, and recompiled. This is largely due to the rule that a #define above the #include can potentially affect the compilation of the file being #included. By rule, therefore, the #included file must be reprocessed for every source file in your project that includes it. To combat this problem, I propose that we compile the .pas files into .cpp files in such a way that they are either 1) contained in just one .cpp file that does not #include any other file, or 2) contained in a set of files with a similar goal of reducing recompilation of redundant files. With this in mind, I think that the C++ compilation stage will add only a negligible amount of compile time to the end product.
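
A minimal sketch of what option 1 might look like, assuming each Pascal unit becomes a namespace inside one generated translation unit. The interface sections turn into forward declarations at the top, so no generated file has to #include another generated file (standard library headers are still pulled in once). The units MathUtils and Greeter are invented for the example.

```cpp
#include <string>

// Interface sections: declarations only, emitted first for every unit.
namespace MathUtils {          // unit MathUtils; interface
    int Square(int x);
}
namespace Greeter {            // unit Greeter; interface (uses MathUtils)
    std::string Describe(int x);
}

// Implementation sections: emitted after all interfaces are visible,
// so units can refer to each other without headers.
namespace MathUtils {
    int Square(int x) { return x * x; }
}
namespace Greeter {
    std::string Describe(int x) {
        return std::to_string(x) + " squared is " +
               std::to_string(MathUtils::Square(x));
    }
}
```

The obvious trade-off is that a single file gets recompiled in full after any change, which is why a set of files (option 2) may still be needed for large projects.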

Consideration #4. Object Life Paradigms
C++ supports constructors/destructors on classes, which behave more like Delphi’s record types. To use a C++ class in a manner similar to a Delphi class, it must be allocated on the heap and accessed through a pointer. All Delphi classes behave like pointers, but sometimes it is also nice to have classes behave in a manner similar to C++ classes. I’d like to upgrade the language to support 3 different styles of object lifetime management: 1) typical Delphi style with “create”… “free”… etc. 2) value types that are stack-allocated, similar to C++ objects and Delphi’s record type, for when you want to create complex types that behave like regular variables 3) automatic reference counting, similar to the Delphi mobile compiler and Delphi’s implementation of interfaces. I figure, why am I limited to “records” being one way, classes being another way, and interfaces being a 3rd way? Why can’t all these structures be whatever way I need them to be?
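
The three styles already have rough C++ counterparts, which is part of why this looks feasible. The sketch below is only an illustration under that assumption: TThing is a made-up type that counts its live instances so the lifetime of each style can be observed.

```cpp
#include <memory>

// Hypothetical type used only to count live instances.
struct TThing {
    static int live;                    // number of instances currently alive
    TThing() { ++live; }
    TThing(const TThing&) { ++live; }
    ~TThing() { --live; }
};
int TThing::live = 0;

// 1) Delphi style: explicit Create ... Free maps to new ... delete.
int delphi_style() {
    TThing* t = new TThing();           // T := TThing.Create;
    int during = TThing::live;
    delete t;                           // T.Free;
    return during;
}

// 2) Value style: lives on the stack, destroyed at the end of the scope.
int value_style() {
    TThing t;
    return TThing::live;                // evaluated before t is destroyed
}

// 3) Reference counted: destroyed when the last owner lets go.
int refcounted_style() {
    std::shared_ptr<TThing> a = std::make_shared<TThing>();
    std::shared_ptr<TThing> b = a;      // second owner of the same object
    a.reset();                          // object survives, b still owns it
    return TThing::live;
}
```

Each function returns 1 while its object is alive, and TThing::live is back to 0 once the function has returned.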

In order to organize my thoughts, I’ve decided to make a list of the prominent language features of Delphi that would need to be considered/translated to make this whole contraption work. Am I missing any? Chime in if you have something to contribute.

1. [EASY] Class translation
2. [EASY] Generic Translation
3. [EASY] Namespace/unit resolution
4. [EASY] Function Translation
5. [MODERATE] Specialized reference counted string handling.
6. [MODERATE] Translation/adaption of the hidden “system” unit.
7. [MODERATE] Const buffer and var buffer nonsense
8. [MODERATE] Ancient functions that accept variable numbers of parameters (likely unsupported)
9. [EASY] For-loop translation
10. [EASY] While-loop translation
11. [EASY] try-finally implementation
12. [EASY] try-except implementation
13. [MODERATE] General exception handling challenges
14. [MODERATE] Method hooks and function pointers
15. [MODERATE] Anonymous methods
16. [MODERATE] Delegate methods (“method of object”)
17. [HARD] ActiveX/OCX/TLB support
18. [MODERATE] DLL imports/exports
19. [MODERATE] DLL delayed loading
20. [HARD] Interface support and generic constraints
21. [MODERATE] Value-type structures: records and packed records
22. [HARD] Integration with the MESSAGE keyword and message pipeline
23. [HARD] DFM support
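
One item worth a closer look is try-finally, because C++ has no “finally” keyword. A common workaround is a small scope guard whose destructor runs the cleanup code. The sketch below assumes that approach; Finally and run are hypothetical names, and the guard would not cover a finally block that itself raises an exception, since a throwing destructor terminates the program.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper: runs the stored callable when the scope exits,
// whether normally or because an exception is propagating.
template <typename F>
class Finally {
public:
    explicit Finally(F f) : f_(std::move(f)) {}
    ~Finally() { f_(); }
    Finally(const Finally&) = delete;
    Finally& operator=(const Finally&) = delete;
private:
    F f_;
};

// Pascal being translated:
//   try
//     Log := Log + 'body;';
//     if Fail then raise Exception.Create('boom');
//   finally
//     Log := Log + 'finally;';
//   end;
std::string run(bool fail) {
    std::string log;
    try {
        auto cleanup = [&log] { log += "finally;"; };
        Finally<decltype(cleanup)> guard(cleanup);   // the 'finally' part
        log += "body;";
        if (fail) throw std::runtime_error("boom");
    } catch (const std::runtime_error&) {
        log += "except;";   // outer handler, present only to show ordering
    }
    return log;
}
```

Calling run(false) yields "body;finally;" and run(true) yields "body;finally;except;", so the cleanup runs before any outer handler, as it would in Delphi.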

9 Replies to “The challenges of writing a Delphi/Object Pascal to C++ compiler.”

  1. Hi Mr Nelson,

    I’m glad to find your blog. I’m also a Pascal lover. And I understand your frustration with the quality of Pascal compilers in the last decade, because I’ve felt that too. I’ve been hoping a compiler expert would come along and see the urgency of building a much better and a lot faster Pascal compiler than the ones currently available.

    The most prominent Pascal compilers today are only the expensive closed-source Embarcadero Delphi and the open-source Free Pascal. There are also RemObjects’ Oxygene and Smart Pascal, but I don’t think they are as popular as Delphi or FPC. I’m not a compiler guy, I know almost nothing about compilers, I’m just a “user” programmer. However, I know what a good (Pascal) compiler should do. It should be fast, optimized, and stable. And I don’t think either Delphi or FPC is as good as the C++ compilers. Many people with proper knowledge have shown me the evidence.

    What I have wanted from a Pascal compiler today are:

    1. Modern in every senses; the language features, the supported platforms and architectures, the optimizations, etc. I want to be able to use Pascal on literally anything codeable, from servers to smartphones, from desktop app to web app, from game to scientific program, everything.

    2. Make the default string type as unicode string and the compiler accepts source code in unicode (like Apple’s Swift). The old Pascal’s 1-based limited length string is archaic. The new Pascal string should be compatible with C/C++’s string (it has been a problem for over 30 years and yet still no solution!), unicode ready, yet still elegant.

    3. It should be open source and free. There are so many good compilers and interpreters and IDEs out there that are free and open source. Even Microsoft has made its compilers and IDE free now. If Pascal wants to be popular again, then it should be free and open source too.

    4. Though it has to be modern and embrace new paradigms, I hope it would also be able to compile legacy code, with little to no modification required. Admit it, there are so many Pascal/Delphi programmers out there. If you leave them out, I think they will ignore you. I think that’s why Embarcadero is still alive despite their expensive and greedy pricing scheme: the legacy code still needs to be maintained. That’s also why I think write/ln and read/ln should still be kept. Yes, it’s archaic, but in a good way. I’d say it’s classic, it’s the trademark of the Pascal language. It’s the spirit of Pascal: simplicity.

    5. Automatic memory management. Modern compilers should be smart. There are many things that can be done automatically to make writing code easy and fun, yet still produce high quality codes. But keep it native, don’t go to “managed” solution a.k.a garbage collector system like C# and Java. Instead, I propose it to be like Apple’s solution, ARC (automatic reference counting).

    6. Make the code generator modular so other people could build their own code generator that targets other platforms, such as Smart Pascal is able to generate javascript code. Today the web platform is very popular, you know. 🙂

    7. I love Pascal as much as the next guy. I hope Pascal compiler should be written in Pascal as well. But, if that can’t be achieved for various reasons (cross platform support for example), I think I can live with 2-steps compilation to another language first. I just want to make sure that the intermediate language would not become nor limit the power and beauty of Pascal language. If the new Pascal variant only respects Pascal on the syntax level, I don’t think it’s a good reason to build another Pascal compiler. I even think it’s wrong since the beginning.

    I like your idea and vision. So, yes, I’d like to join your adventure. Though perhaps I couldn’t help much on the code writing, I can help on many other things. Because I’d like to see the new Pascal compiler that I can use with pride. 🙂

    Drop me a line. Thank you.

    Regards,

    -Bee

    1. Would you be willing to look over a video presentation I am preparing (once it is complete)? I’d appreciate your feedback.

      – Jason

  2. Hi, I just found your blog by chance in Google.
    I’m actually currently developing exactly this, an Object Pascal to C++ compiler, started some months ago.

    I’m doing this mainly because I use Object Pascal for lots of platforms, and even FPC is lacking in that area.

    Many of your points I’ve got them covered, others I am not really interested at all, at least for now, since I don’t need them.

    Also you put function translation as easy, but you forgot about nested functions, which do not exist in C++.

    However right now I’m focusing on the features that my code uses, and not all features supported by the language.

    If you are interested in discussing this contact me.

    1. Hi, sorry, I haven’t been terribly active on my own blog. I would definitely be interested in discussing a Delphi C++ compiler project. The compiler I envision would not be a “translator” so much, and therefore the lack of nested function support would be a minor issue. I could simply compile the nested function inline with the outer function. The variable scopes could be local to the inner squigglies {} and/or prefixed. Currently I am finishing up another project, and I’m working on a presentation that I hope might help me get some funding. It would be a lot easier to build this thing if I could get a little time off. C++ doesn’t support quite a few things that Delphi supports, for example: Delegate methods is a pretty big one that comes to mind… IMO a bigger issue than nested functions. My hope is to achieve parity between the two languages eventually and make object pascal a 21st-century C++ replacement.

  3. Hi JNelson and PascalCoder,

    So, how does the development go? PascalCoder, would you like to make it open source? Would you share it on github or sourceforge?

    Thank you.

    1. I am undecided about the open-source thing…. possibly. Making it open source would require me to detach it from a few things that, for licensing/intellectual property reasons, I am not allowed to publish open source… so I’d have to spend some time organizing the parts that can be made open (99% of it) from the parts that can’t.

      1. I’ve always wondered how complex a task it is to convert a particular language into another and seems like the consensus is that it’s indeed a handful. Does your approach to “inner squigglies” involve any kind of recursion? I’m assuming managing variable scopes should be straightforward, but how about maintaining a proper stack trace or implementing features like closures?

        1. Great point, Nathan. Managing variable scopes should be relatively straightforward; however, maintaining a proper stack trace could indeed add a layer of complexity. As for closures, how they’re implemented would largely depend on the memory management strategy and mutable state handling of the resulting C++ code. Given the inherent disparities between the two languages, striking the right balance between translation accuracy and efficiency would indeed be a feat of some technical prowess. Generics, exceptions, and concurrency features would also add their own flavor to the challenge pot.

          1. You’re right, Henry. Maintaining a proper stack trace might be complex. I’m curious how concepts like closures will be addressed too, especially since they’re such a staple in modern programming and can be tricky when bringing over from a language like Delphi. I bet that’s gonna be one of the more nuanced parts of the compiler, right there with generics and threaded execution. C++’s existing infrastructure for these concepts does differ quite a bit from Delphi’s after all.

          2. Generics, closures, and threading have nuanced translations between languages, indeed. They hinge on deep language integration, not just syntax – that’s where it gets really tricky.

        2. Converting one language to another is like translating Shakespeare into Klingon. Tricky and rarely clean, especially with nested goodies!

  4. Sorta tough to say: Blackberry: Java ME (Micro Edition); iPhone: Objective C; Palm Pre: HTML, CSS, JavaScript; PDA w/Windows Mobile: Embedded VC++. What you do is look for the devices’ Software Development Kit (SDK). Different SDKs will be written for specific devices. You need to use the manufacturer’s SDK (and programming language) in order to access the full functionality of each individual device. Wouldn’t it be great if all of ’em used the same language SDK/API?! Too bad for us developers! Good luck.
