Week 9: Submodule Implementation

Hi everyone, welcome to the ninth blog of my GSoC'25 journey. I dedicated this week to implementing submodules in separate files for monolithic compilation (that is, without separate compilation). In the previous week, we implemented submodules in a single file, and in multiple files with separate compilation. With that implementation, however, submodules spread across multiple files did not work without separate compilation. We identified the cause and decided to add a compiler option through which the required submodules can be passed to the program and loaded.

So, the first thing I did was to create a compiler option named --add-submodule, whose purpose is to pass the names of the required submodules, given as command-line arguments, to our main program. I also created a vector of strings to collect these submodule names so that I could use them later to load the submodules.
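To make the idea concrete, here is a minimal sketch of a repeatable option that collects names into a list. It is written in Python with argparse purely for illustration and is not the LFortran code. It assumes the option is given once per submodule; the file name and the submodule names are hypothetical.

```python
import argparse


def parse_args(argv):
    """Parse a command line, collecting every --add-submodule value."""
    parser = argparse.ArgumentParser(prog="compiler-sketch")
    parser.add_argument("source", help="main source file")
    # action="append" gathers each occurrence of the option into one list,
    # playing the role of the vector of strings described above.
    # Assumption: one submodule name per occurrence of the option.
    parser.add_argument(
        "--add-submodule",
        action="append",
        default=[],
        dest="submodules",
        metavar="NAME",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    # Hypothetical invocation with two submodules.
    args = parse_args(
        ["main.f90", "--add-submodule", "shapes_impl",
         "--add-submodule", "io_impl"]
    )
    print(args.submodules)
```

The collected list keeps the order in which the names were given, so a later pass can simply iterate over it and load each submodule in turn.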

After this, I moved on to the next part of the implementation. In the AST symbol table visitor, I iterated over the vector of submodule names, processed them a bit, and then used the names to load the submodules. But this created a problem in deserialization, as we were not able to read the parent module symbol while loading a submodule. Initially, I tried various things to get around this: I tried to deserialize the parent module, and I also tried to modify the parent module symbol using ExternalSymbol, as we had some special handling for ExternalSymbols in deserialization. The link to the Pull Request for this overall implementation is Pull #8064.

But none of these approaches resolved the deserialization error, so I decided to dig deeper to understand the exact issue. I found that whenever we load a module, we create a new Translation Unit and load the module within it. So, when we tried to read the parent module symbol while loading a submodule, we could not, because the parent module had been loaded into a different Translation Unit and its symbol was not accessible. After a discussion with Ondrej Certik, we agreed on an approach to resolve this error: change the Parent Module flag of the Module ASR from a symbol to a string. This simplifies deserialization, since we no longer need to deserialize any symbol for the parent module.
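The failure mode can be pictured with a small toy model. This is only a sketch of the reasoning above, not LFortran's actual data structures: the class, the functions, and the module names are all hypothetical, and a real symbol table is far richer than a dictionary.

```python
class TranslationUnit:
    """Toy stand-in for a translation unit: just a table of symbols."""

    def __init__(self):
        self.symbols = {}


def load_module(name):
    """Load a module into a brand-new translation unit, as described above."""
    unit = TranslationUnit()
    unit.symbols[name] = {"name": name}
    return unit


def resolve_parent(unit, parent_name):
    """Look up the parent module symbol inside the given unit only."""
    return unit.symbols.get(parent_name)


if __name__ == "__main__":
    parent_unit = load_module("shapes")        # hypothetical parent module
    sub_unit = load_module("shapes_impl")      # hypothetical submodule

    # The parent symbol exists, but only in the parent's own unit.
    print(resolve_parent(parent_unit, "shapes") is not None)
    # From the submodule's unit, the lookup finds nothing.
    print(resolve_parent(sub_unit, "shapes") is None)
```

In this model a reference to the parent as a symbol can never be resolved from the submodule's unit, whereas a plain name would need no lookup at load time.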

The Parent Module flag now stores the name of the parent module for submodules; for regular modules, it is an empty string. This also helps us distinguish a submodule from a regular module. I then updated the entire codebase for this change, and with that, we were able to load our submodules into the program. Once this step was complete, the submodule was loaded into the global scope, and so LLVM functions were generated for the submodule's functions. The rest was handled by the linker, and thus submodules in separate files now work with monolithic compilation. The link to the Pull Request for this complete implementation is Pull #8081.
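As a rough illustration of why a string is easier to serialize than a symbol, the sketch below models a module record whose parent is stored by name. It is not the ASR definition: the class, the field names, the JSON format, and the module names are assumptions chosen only to show the idea.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Module:
    """Toy module record; parent_module is a name, not a symbol reference."""

    name: str
    parent_module: str = ""  # empty string means a regular module

    def is_submodule(self):
        # A non-empty parent name is what marks a submodule.
        return self.parent_module != ""


def serialize(module):
    """Write the record out; the parent is just another string field."""
    return json.dumps(asdict(module))


def deserialize(text):
    """Read the record back without resolving any other module."""
    return Module(**json.loads(text))


if __name__ == "__main__":
    regular = Module("shapes")                   # hypothetical names
    sub = Module("shapes_impl", parent_module="shapes")

    restored = deserialize(serialize(sub))
    print(restored == sub)
    print(regular.is_submodule(), restored.is_submodule())
```

Because the parent is only a name, the round trip never needs the parent module to be present, which is the property that the symbol-based version lacked.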

Currently, the Pull Request is failing some CI checks for reasons that are not yet clear, since the same tests pass on my local system. Also, as submodules were working perfectly with separate compilation, I decided to revert the submodule workarounds in stdlib for separate compilation, on top of the latest separate compilation branch of stdlib, so that we save our progress on submodules (with separate compilation) for stdlib. I have started working on this and have made some progress, which can be tracked through the revert-submodules-workarounds-sc-1 branch.

In the next week, my first job will be to complete Pull #8081 and get it merged. This should not be a difficult task. Next, I will continue reverting the stdlib submodule workarounds for separate compilation, and if something breaks after reverting, I will fix it accordingly. I will also save this progress by setting it up on CI, and then do the same for the stdlib submodule workarounds with monolithic compilation. This will make our submodule implementation more robust across every compilation mode.

Overall, I worked for 26 hours this week and enjoyed the work I did in the ninth week. I would like to thank Ondrej Certik, Harshita Kalani, Pranav Goswami and all the other LFortran members for their reviews and suggestions, which helped me a lot in tackling new difficulties. I am looking forward to continuing my journey next week with the same excitement and enthusiasm, and I plan to complete my proposed tasks as quickly as I can.