ROSE Compiler Framework/FAQ

We collect a list of frequently asked questions about ROSE, mostly from the rose-public mailing list.

How to search the rose-public mailing list for previously asked questions?
Use the following command on google search

How to check the version of ROSE?
The version is defined in ROSE_Install_path/include/rose/rosePublicConfig.h:

    /* Define to the version of this package. */
    #define ROSE_PACKAGE_VERSION "0.9.8.54"

To check this in your code:

    bool checkRoseVersionNumber(const std::string &need) {
        std::vector<std::string> needParts = rose::StringUtility::split('.', need);
        std::vector<std::string> haveParts = rose::StringUtility::split('.', ROSE_PACKAGE_VERSION);

        // Note: the parts are compared as strings, so "10" sorts before "9".
        for (size_t i = 0; i < needParts.size() && i < haveParts.size(); ++i) {
            if (needParts[i] != haveParts[i])
                return needParts[i] < haveParts[i];
        }

        // E.g., need = "1.2" and have = "1.2.x", or vice versa
        return true;
    }
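The check above compares version components as strings, so a component such as "10" would sort before "9". The following is a minimal sketch of a numeric comparison in standard C++ with no ROSE dependency. The names parseVersion and versionAtLeast are hypothetical, every component is assumed to be a non-negative decimal integer, and (unlike the function above) missing trailing components are treated as zero.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split a dotted version string such as "0.9.8.54" into numeric components.
// Assumes every component is a non-negative decimal integer.
std::vector<unsigned long> parseVersion(const std::string &version) {
    std::vector<unsigned long> parts;
    std::istringstream in(version);
    std::string field;
    while (std::getline(in, field, '.'))
        parts.push_back(std::stoul(field));
    return parts;
}

// Return true if version `have` is at least version `need`.
// Missing trailing components count as zero, so "1.2" equals "1.2.0".
bool versionAtLeast(const std::string &have, const std::string &need) {
    const std::vector<unsigned long> h = parseVersion(have);
    const std::vector<unsigned long> n = parseVersion(need);
    const std::size_t count = h.size() > n.size() ? h.size() : n.size();
    for (std::size_t i = 0; i < count; ++i) {
        const unsigned long hv = i < h.size() ? h[i] : 0;
        const unsigned long nv = i < n.size() ? n[i] : 0;
        if (hv != nv)
            return hv > nv;  // first differing component decides
    }
    return true;  // all components are equal
}
```

In a ROSE-based tool this could be called as versionAtLeast(ROSE_PACKAGE_VERSION, "0.9.8"), assuming rosePublicConfig.h has been included.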

Why can't ROSE staff members answer all my questions?
It can feel very frustrating when you get no response to a question submitted to the rose-public@nersc.gov mailing list. You may wonder why the ROSE staff cannot always help either.

Here are some possible excuses:
 * They are just as busy as everybody else in the research and development fields. They may be working around the clock to meet deadlines for proposals, papers, project reviews, deliverables, etc.
 * They don't know every corner of their own compiler, given the breadth and depth of contributions made to ROSE by collaborators, former staff members, post-docs, and interns. Moreover, most contributions lack good documentation--something that should be remedied in the future.
 * Some questions are simply difficult and open research and development questions. They may have no clue, either.
 * They just feel lazy sometimes or are taking a thing called vacation.

Possible alternatives to have your questions answered and your problems solved in a timely fashion:
 * Please do your own homework first (e.g. Google).
 * The ROSE team is actively addressing the documentation problem, through an internal code review process to enforce well-documented contributions going forward.
 * Help others to help yourself. Answer questions on the rose-public@nersc.gov mailing list and contribute to this community-editable Wikibook.
 * Find ways to formally collaborate with, or fund, the ROSE team. Things go faster when money is flowing :-) Sad, but true, reality in this busy world.

How many lines of source code does ROSE have?
Excluding the EDG submodule and all source code comments, the core of ROSE (rose/src) has about 674,000 lines of C/C++ source code as of July 11, 2012.

Including tests, projects, and tutorial directories, ROSE has about 2 Million lines of code.

Some details are shown below:

    [rose/src] ./cloc-1.56.pl .
    3076 text files.
    2871 unique files.
    716 files ignored.

    http://cloc.sourceforge.net v 1.56  T=26.0 s (91.7 files/s, 39573.3 lines/s)
    ---------------------------------------------------------------------------
    Language                    files          blank        comment           code
    ---------------------------------------------------------------------------
    C++                           908          75280          93960         354636
    C                             123          12010           3717         199087
    C/C++ Header                  915          28302          38412         121373
    Bourne Shell                   17           3346           4347          25326
    Perl                            4            743           1078           7888
    Java                           18           1999           4517           7096
    m4                              1            747             20           6489
    Python                         34           1984           1174           5363
    make                          148           1682           1071           3666
    C#                             11            899            274           2546
    SQL                             1              0              0           1817
    Pascal                          5            650             31           1779
    CMake                         168           1748           4880           1702
    yacc                            3            352            186           1544
    Visual Basic                    6            228            421           1180
    Ruby                           11            281            181            809
    Teamcenter def                  3              3              0            606
    lex                             2            103             47            331
    CSS                             1             95             32            314
    Fortran 90                      1             34              6            244
    Tcl/Tk                          2             29              6            212
    HTML                            1              8              0             15
    ---------------------------------------------------------------------------
    SUM:                         2383         130523         154360         744023
    ---------------------------------------------------------------------------

How large is ROSE?
To show top-level information only (in MB):

    du -msl * | sort -nr
    170  tests
    109  projects
    90   src
    19   docs
    16   winspecific
    16   ROSE_ResearchPapers
    15   binaries
    7    scripts
    5    LicenseInformation
    4    tutorial
    4    autom4te.cache
    2    libltdl
    2    exampleTranslators
    2    configure
    2    config
    2    ChangeLog

To sort directories by their sizes in megabytes:

    du -m | sort -nr >~/size.txt
    709  .
    250  ./.git
    245  ./.git/objects
    243  ./.git/objects/pack
    170  ./tests
    109  ./projects
    90   ./src
    76   ./tests/CompileTests
    50   ./tests/RunTests
    40   ./tests/RunTests/FortranTests
    34   ./tests/RunTests/FortranTests/LANL_POP
    29   ./tests/RunTests/FortranTests/LANL_POP/netcdf-4.1.1
    27   ./src/3rdPartyLibraries
    23   ./tests/roseTests
    23   ./src/frontend
    22   ./tests/CompileTests/Fortran_tests
    21   ./tests/CompilerOptionsTests
    19   ./docs
    18   ./tests/CompileTests/RoseExample_tests
    18   ./src/midend
    18   ./docs/Rose
    16   ./winspecific
    16   ./ROSE_ResearchPapers
    15   ./tests/CompileTests/Fortran_tests/gfortranTestSuite
    15   ./binaries/samples
    15   ./binaries
    14   ./tests/CompileTests/Fortran_tests/gfortranTestSuite/gfortran.dg
    14   ./src/roseExtensions
    11   ./projects/traceAnalysis
    10   ./tests/CompileTests/A++Code
    10   ./tests/CompilerOptionsTests/testCpreprocessorOption
    10   ./tests/CompilerOptionsTests/A++Code
    10   ./src/roseExtensions/qtWidgets
    10   ./src/frontend/Disassemblers
    10   ./projects/symbolicAnalysisFramework
    10   ./projects/SATIrE
    10   ./projects/compass
    9    ./winspecific/MSVS_ROSE
    9    ./tests/RunTests/A++Tests
    9    ./tests/roseTests/binaryTests
    9    ./src/frontend/SageIII
    9    ./projects/symbolicAnalysisFramework/src
    9    ./docs/Rose/powerpoints
    8    ./winspecific/MSVS_project_ROSETTA_empty
    8    ./projects/simulator
    7    ./tests/RunTests/FortranTests/LANL_POP_OLD
    7    ./tests/CompileTests/Cxx_tests
    7    ./src/midend/programTransformation
    7    ./src/midend/programAnalysis
    7    ./src/3rdPartyLibraries/libharu-2.1.0
    7    ./scripts
    7    ./projects/symbolicAnalysisFramework/src/mpiAnal
    7    ./projects/RTC
    6    ./winspecific/MSVS_ROSE/Debug
    6    ./tests/RunTests/FortranTests/LANL_POP/netcdf-4.1.1/ncdap_test
    6    ./tests/roseTests/programAnalysisTests
    6    ./src/3rdPartyLibraries/ckpt
    6    ./src/3rdPartyLibraries/antlr-jars
    6    ./projects/SATIrE/src
    5    ./tests/RunTests/FortranTests/LANL_POP/pop-distro
    5    ./tests/RunTests/FortranTests/LANL_POP/netcdf-4.1.1/libcf
    5    ./tests/CompileTests/ElsaTestCases
    5    ./src/ROSETTA
    5    ./src/3rdPartyLibraries/qrose
    5    ./projects/DatalogAnalysis
    5    ./projects/backstroke
    5    ./LicenseInformation
    5    ./docs/Rose/AstProcessing

To list files based on size:

    find . -type f -print0 | xargs -0 ls -s | sort -k1,1rn
    241568 ./.git/objects/pack/pack-f366503d291fc33cb201781e641d688390e7f309.pack
    13484  ./tests/CompileTests/RoseExample_tests/Cxx_Grammar.h
    10240  ./projects/traceAnalysis/vmp-hw-part.trace
    6324   ./tests/RunTests/FortranTests/LANL_POP_OLD/poptest.tgz
    5828   ./winspecific/MSVS_ROSE/Debug/MSVS_ROSETTA.pdb
    4732   ./.git/objects/pack/pack-f366503d291fc33cb201781e641d688390e7f309.idx
    4488   ./binaries/samples/bgl-helloworld-mpicc
    4488   ./binaries/samples/bgl-helloworld-mpixlc
    4080   ./LicenseInformation/edison_group.pdf
    3968   ./projects/RTC/tags
    3952   ./src/frontend/Disassemblers/x86-InstructionSetReference-NZ.pdf
    3908   ./tests/CompileTests/RoseExample_tests/trial_Cxx_Grammar.C
    3572   ./winspecific/MSVS_project_ROSETTA_empty/MSVS_project_ROSETTA_empty.ncb
    3424   ./src/frontend/Disassemblers/x86-InstructionSetReference-AM.pdf
    2868   ./.git/index
    2864   ./projects/compassDistribution/COMPASS_SUBMIT.tar.gz
    2864   ./projects/COMPASS_SUBMIT.tar.gz
    2740   ./ROSE_ResearchPapers/2007-CommunicatingSoftwareArchitectureUsingAUnifiedSingle-ViewVisualization-ICECCS.pdf
    2592   ./docs/Rose/powerpoints/rose_compiler_users.pptx
    2428   ./src/3rdPartyLibraries/ckpt/wrapckpt.c
    2408   ./projects/DatalogAnalysis/jars/weka.jar
    2220   ./scripts/graph.tar
    1900   ./src/3rdPartyLibraries/antlr-jars/antlr-3.3-complete.jar
    1884   ./src/3rdPartyLibraries/antlr-jars/antlr-3.2.jar
    1848   ./src/midend/programTransformation/ompLowering/run_me_defs.inc
    1772   ./src/3rdPartyLibraries/qrose/docs/QROSE.pdf
    1732   ./tests/CompileTests/Cxx_tests/longFile.C
    1724   ./src/midend/programTransformation/ompLowering/run_me_task_defs.inc
    1656   ./ChangeLog
    1548   ./tests/roseTests/binaryTests/yicesSemanticsExe.ans
    1548   ./tests/roseTests/binaryTests/yicesSemanticsLib.ans
    1480   ./ROSE_ResearchPapers/1997-ExpressionTemplatePerformanceIssues-IPPS.pdf
    1408   ./docs/Rose/powerpoints/ExaCT_AllHands_March2012_ROSE.pptx

...

Cannot download the EDG binary tar ball
There are three possible reasons:
 * The website hosting the EDG binaries is down (there is a manual way to get the binary).
 * We don't support the platform you use, so no EDG binary is available for you.
 * You cloned your ROSE from an unofficial repository, so the build process cannot figure out the right version of the EDG binary for you (a solution is described below).

It is possible that the rosecompiler.org website is down for maintenance.

So you may encounter the following error message:

    make[3]: Entering directory `/home/leo/workspace/github-rose/buildtree/src/frontend/CxxFrontend'
    test -d /nfs/casc/overture/ROSE/git/ROSE_EDG_Binaries && cp /nfs/casc/overture/ROSE/git/ROSE_EDG_Binaries/roseBinaryEDG-3-3-i686-pc-linux-gnu-GNU-4.4-32fe4e698c2e4a90dba3ee5533951d4c.tar.gz . || wget http://www.rosecompiler.org/edg_binaries/roseBinaryEDG-3-3-i686-pc-linux-gnu-GNU-4.4-32fe4e698c2e4a90dba3ee5533951d4c.tar.gz
    --2012-08-05 12:58:29-- http://www.rosecompiler.org/edg_binaries/roseBinaryEDG-3-3-i686-pc-linux-gnu-GNU-4.4-32fe4e698c2e4a90dba3ee5533951d4c.tar.gz
    Resolving www.rosecompiler.org... 128.55.6.204
    Connecting to www.rosecompiler.org|128.55.6.204|:80... failed: No route to host.
    make[3]: *** [roseBinaryEDG-3-3-i686-pc-linux-gnu-GNU-4.4-32fe4e698c2e4a90dba3ee5533951d4c.tar.gz] Error 4

In this case, you should ask for the missing tarball or find it at our backup location. You don't have to clone the entire EDG binary repository, since it is big. You can just download the one file you need (click the raw file link on github.com).
 * https://github.com/rose-compiler/edg-binaries

Once you get the tarball, copy it to your build tree's CxxFrontend subdirectory:
 * buildtree/src/frontend/CxxFrontend

Then you should be able to normally build rose by typing make.

TODO: automate the search using the alternative path to obtain edg binary

Another possible reason is that you cloned your local rose repo from an unofficial repository.
 * In order to maintain the correct matching between rose source and EDG binary, we require a canonical repository to be available.

    make[3]: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend/CxxFrontend/Clang'
    Unable to find a remote tracking a canonical repository. Please add a canonical repository as a remote and ensure it is up to date. Currently configured remotes are:

        origin => git@xxx.com/myrose.git

    Potential canonical repositories include:

        anything ending with "rose.git" (case insensitive)
    Unable to find a remote tracking a canonical repository. Please add a canonical repository as a remote and ensure it is up to date. Currently configured remotes are:

        origin => git@xxx.com/myrose.git

    Potential canonical repositories include:

        anything ending with "rose.git" (case insensitive)
    make[3]: Entering directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend/CxxFrontend'
    test -d /nfs/casc/overture/ROSE/git/ROSE_EDG_Binaries && cp /nfs/casc/overture/ROSE/git/ROSE_EDG_Binaries/roseBinaryEDG-3-3-x86_64-pc-linux-gnu-GNU-4.3-.tar.gz . || wget http://www.rosecompiler.org/edg_binaries/roseBinaryEDG-3-3-x86_64-pc-linux-gnu-GNU-4.3-.tar.gz
    --2013-02-15 17:26:42-- http://www.rosecompiler.org/edg_binaries/roseBinaryEDG-3-3-x86_64-pc-linux-gnu-GNU-4.3-.tar.gz
    Resolving www.rosecompiler.org... 128.55.6.204
    Connecting to www.rosecompiler.org|128.55.6.204|:80... connected.
    HTTP request sent, awaiting response... 404 Not Found
    2013-02-15 17:26:42 ERROR 404: Not Found.

    make[3]: *** [roseBinaryEDG-3-3-x86_64-pc-linux-gnu-GNU-4.3-.tar.gz] Error 1
    make[3]: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend/CxxFrontend'
    make[2]: *** [all-recursive] Error 1
    make[2]: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend/CxxFrontend'
    make[1]: *** [all-recursive] Error 1
    make[1]: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src/frontend'
    make: *** [all-recursive] Error 1
    make: Leaving directory `/global/project/projectdirs/rosecompiler/rose-project-workspace/xomp-instr/buildtree/src'

Solution: add an official ROSE repository as an additional remote of your local repository
 * Add a canonical repository, like the one on GitHub: git remote add official-rose https://github.com/rose-compiler/rose.git
 * git fetch official-rose // to retrieve hash numbers etc. from the canonical repository
 * Now you can build ROSE again. It should find the canonical repository you just added and use it to find a matching EDG binary.

How to access EDG or EDG-SAGE connection code?
From page 5 of http://rosecompiler.org/ROSE_UserManual/ROSE-UserManual.pdf

The connection code that was used to translate EDG’s AST to SAGE III was derived loosely from the EDG C++ source generator and has formed the basis of the SAGE III translator from EDG to SAGE III’s IR.

Under the license we have, the EDG source code and the translation from the EDG AST are excluded from source releases and are made available only in binary format. No part of the EDG work is visible to the user of ROSE. The EDG source is available only to those who have an EDG research or commercial license.

Chapter 2.6 "Getting a Free EDG License for Research Use" of the manual has instructions about how to obtain the EDG license.

Once you obtain the license, please contact the staff members of ROSE to verify your license. After that, they will give you more instructions about how to proceed.

How to speedup compiling ROSE?
Question: It takes hours to compile ROSE. How can I speed up this process?

Answer:
 * If you have a multi-core processor, try make -j4 (build using four processes, or more if you like).
 * Also try building only librose.so under src/ by typing make -C src/ -j4
 * Or build only the language support you are interested in by selecting it during configure, for example:
 * ../sourcetree/configure --enable-only-c # if you are only interested in C/C++ support
 * ../sourcetree/configure --enable-only-fortran # if you are only interested in Fortran support
 * ../sourcetree/configure --help # show all other options to enable only a few languages

Can ROSE accept incomplete code?
https://mailman.nersc.gov/pipermail/rose-public/2011-July/001015.html

ROSE does not handle incomplete code, though this might be possible in the future. It would be language dependent and would likely depend heavily on some of the language-specific tools that we use internally. This is, however, not really a priority for our work. If you want to demonstrate, for example, how some of the internal tools we are using, or alternative tools that we could use, might handle incomplete code, this might be interesting and we could discuss it.

For example, we are not presently using Clang, but if it handled incomplete code that might be interesting for the future. I recall that some of the latest EDG work might handle some incomplete code, and if that is true then that might be interesting as well. I have not attempted to handle incomplete code with OFP, so I am not sure how well that could be expected to work. Similarly, I don't know what the incomplete code handling capabilities of the ECJ Java support are. If you know the answers to any of these questions, we could discuss this further.

I have some doubts about how much meaningful information can come from incomplete code analysis and so that would worry me a bit. I expect it is very language dependent and there would be likely some constraints on the incomplete code. So understanding the subject better would be an additional requirement for me.

Can ROSE analyze Linux Kernel sources?
https://mailman.nersc.gov/pipermail/rose-public/2011-April/000856.html

Question: I'm trying to analyze the Linux kernel. I was not sure of the size of the code-base that can be handled by ROSE, and could not find references as to whether it has been tried on the Linux kernel source. As of now I'm trying to run the identity translator on the source, and would like to know if it can be done using ROSE, and if it has been successfully tested before.

Short answer: Not for now

Long answer: We are using EDG 3.3 internally by default and this version of EDG does not handle the GNU specific register modifiers used in the asm statements of the Linux Kernel code. There might be other problems, but that was at least the one that we noticed in previous work on this some time ago. But we are working on upgrading the EDG frontend to be a more recent version 4.4.

Can ROSE compile C++ Boost library?
https://mailman.nersc.gov/pipermail/rose-public/2010-November/000544.html

Not yet.

I know of a few cases where ROSE can't handle parts of Boost. In each case it is an EDG problem where we are using an older version of EDG. We are trying to upgrade to a newer version of EDG (4.x), but that version's use within ROSE does not include enough C++ support, so it is not ready. The C support is internally tested, but we need more time to work on this.

How to find XYZ in AST?
The usual steps to retrieve information from the AST are listed below. Some sample AST graphs are available at https://github.com/chunhualiao/rose-ast
 * Prepare the simplest possible (preferably 5-10 lines only), compilable sample code containing the code feature you want to find (e.g. array[i][j] if you are curious about how to find uses of multi-dimensional arrays in the AST).
 * Please note: don't include any headers in the sample code. A single #include can bring thousands of nodes into the AST.
 * Use dotGeneratorWholeASTGraph to generate a detailed AST dot graph of the input code.
 * Use zgrviewer-0.8.2's run.sh to visualize the dot graph.
 * Visually/manually locate the information you want in the dot graph, to understand what to look for and where to look.

How to get children of an AST node?
Once you know how to find a child in the AST manually, you can write code that walks the AST using AST member functions, traversals, or SageInterface functions to retrieve the information you want.
 * ROSE provides member access functions named get_X by default for a child named X, such as get_lhs_operand for an SgBinaryOp with a child named lhs_operand in the AST graph.
 * The names are shown in the AST graph as labels of edges from parents to children.

To get a child by index (not recommended), use the following function or related, similarly named functions:

    virtual SgNode* get_traversalSuccessorByIndex(size_t idx)

How to filter out header files from AST traversals?
https://mailman.nersc.gov/pipermail/rose-public/2010-April/000144.html

Question: I want to exclude functions in #include files from my analysis/transformations during my processing.

By default, an AST traversal may visit all AST nodes, including the ones that come from headers.

So the AST processing classes provide three functions:
 * T traverse(SgNode* node, ..): traverses the full AST, including nodes which represent code from include files
 * T traverseInputFiles(SgProject* projectNode, ..): traverses the subtrees of the AST which represent the files specified on the command line
 * T traverseWithinFile(SgNode* node, ..): traverses only the nodes which represent code of the same file as the start node

Should SgIfStmt::get_true_body return SgBasicBlock?
https://mailman.nersc.gov/pipermail/rose-public/2011-April/000930.html

Both true/false bodies were SgBasicBlock before.

Later, we decided to have more faithful representation of both blocked (with {...}) and single-statement (without { ..} ) bodies. So they are SgStatement (SgBasicBlock is a subclass of SgStatement) now.

But it seems the documentation has not been updated to be consistent with the change.

You have to check in your code whether the body is a block or a single statement. Alternatively, you can use the following function to ensure that a body is an SgBasicBlock:

    // A wrapper of all ensureBasicBlockAs* functions to ensure the parent of s is a scope statement with a list of statements as children; otherwise generate a SgBasicBlock in between.
    SgLocatedNode* SageInterface::ensureBasicBlockAsParent(SgStatement* s)

How to handle #include "header.h", #if, #define etc. ?
These are called preprocessing info within ROSE's AST. They are attached before, after, or inside a nearby AST node (only nodes with source location information).

An example translator is provided to traverse the input code's AST and dump information about the found preprocessing information. The source code of this translator is https://github.com/rose-compiler/rose/blob/master/exampleTranslators/defaultTranslator/preprocessingInfoDumper.C.

To use the translator:

    buildtree/exampleTranslators/defaultTranslator/preprocessingInfoDumper -c main.cxx
    ---
    Found an IR node with preprocessing Info attached:
    (memory address: 0x2b7e1852c7d0 Sage type: SgFunctionDeclaration) in file
    /export/tmp.liao6/workspace/userSupport/main.cxx (line 3 column 1)
    -PreprocessingInfo #0 --- :
    classification = CpreprocessorIncludeDeclaration:
    String format = #include "all_headers.h"
    relative position is = before

SgClassDeclaration::get_definition returns NULL?
If you look at the whole AST graph carefully, you can find defining and non-defining declarations for the same class.

A symbol is usually associated with a non-defining declaration. A class definition is associated with a defining declaration.

You may want to get the defining declaration from the non-defining declaration before you try to grab the definition, as in this function (written for functions; the analogous calls exist on SgClassDeclaration):

    SgFunctionDefinition* getFunctionDefinitionFromDeclaration(const SgFunctionDeclaration* funcDecl)
    {
        // Get the defining declaration (we don't know if funcDecl is the defining or non-defining declaration)
        SgFunctionDeclaration* funcDefDecl = isSgFunctionDeclaration(funcDecl->get_definingDeclaration());
        ROSE_ASSERT(funcDefDecl != NULL);

        // Get the definition from the defining declaration
        SgFunctionDefinition* funcDef = isSgFunctionDefinition(funcDefDecl->get_definition());
        ROSE_ASSERT(funcDef != NULL);
        return funcDef;
    }

How to handle arrays?
The first step is to get familiar with the AST representing Array types (SgArrayType) and array references (SgPntrArrRefExp). Then you can retrieve the necessary information from the AST.

To understand array types and array references, here is an example:

    // cat ~/temp/array.c
    int a[5][10][15];      // array declaration, a type is declared
    int foo()
    {
        return a[0][1][2]; // a reference to an array element
    }

An Array Type is represented by SgArrayType.

The declaration int a[5][10][15] corresponds to three SgArrayType nodes linked together, listed below. a->get_type() returns the first one, so a traversal from the first node down to the element type collects all the dimension sizes: 5, 10, 15.
 * SgArrayType_1: (index=5, base_type = SgArrayType_2)
 * SgArrayType_2: (index=10, base_type = SgArrayType_3)
 * SgArrayType_3: (index=15, base_type = SgTypeInt )

The subtree looks like:

        SgArrayType_1
         /        \
        5      SgArrayType_2
                /        \
              10      SgArrayType_3
                       /        \
                     15      SgTypeInt
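The walk along the chain of array types can be illustrated without ROSE. The sketch below uses a toy struct, ToyType, as a stand-in for the type nodes; it is not a ROSE class, and collectDimensions and exampleDimensions are hypothetical names. In real ROSE code the same loop would read the index and base_type fields shown above from each SgArrayType.

```cpp
#include <vector>

// Toy stand-in for a type node; NOT a ROSE class.
// An array node stores its own dimension size and points to its base type.
// A non-array (element) type has base == nullptr.
struct ToyType {
    int size;             // dimension size; unused for element types
    const ToyType *base;  // next type in the chain, or nullptr
};

// Walk from the outermost array type down to the element type,
// collecting the dimension sizes in declaration order.
std::vector<int> collectDimensions(const ToyType *type) {
    std::vector<int> sizes;
    while (type != nullptr && type->base != nullptr) {
        sizes.push_back(type->size);
        type = type->base;
    }
    return sizes;
}

// Build the chain for "int a[5][10][15]" and return its dimension sizes.
std::vector<int> exampleDimensions() {
    const ToyType elementInt = {0, nullptr};   // plays the role of SgTypeInt
    const ToyType inner  = {15, &elementInt};  // plays the role of SgArrayType_3
    const ToyType middle = {10, &inner};       // plays the role of SgArrayType_2
    const ToyType outer  = {5, &middle};       // plays the role of SgArrayType_1
    return collectDimensions(&outer);
}
```

Calling exampleDimensions() yields the sizes 5, 10, 15 in that order.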

An array reference is represented by SgPntrArrRefExp

A reference like: a[0][1][2]
 * SgPntrArrRefExp_1 
 * SgPntrArrRefExp_2 
 * SgPntrArrRefExp_3 

The subtree should look like the following:

            a[0][1][2]        // SgPntrArrRefExp
             /      \
        a[0][1]      2        // SgIntVal
         /    \
      a[0]     1
      /  \
     a    0
     // a: SgVarRefExp

There are quite a few functions related to array handling in http://rosecompiler.org/ROSE_HTML_Reference/namespaceSageInterface.html

You can just search "array" to find them:

    // Check if an expression is an array access (SgPntrArrRefExp). If so, return its name expression and subscripts if requested. Users can use convertRefToInitializedName to get the possible name. It does not check if the expression is a top level SgPntrArrRefExp.
    bool SageInterface::isArrayReference (SgExpression *ref, SgExpression **arrayNameExp=NULL, std::vector< SgExpression * > **subscripts=NULL)

    // Returns the array dimensions in an array as defined for arrtype.
    std::vector< SgExpression * > SageInterface::get_C_array_dimensions (const SgArrayType &arrtype)

    // Get the number of dimensions of an array type.
    int SageInterface::getDimensionCount (SgType *t)

    // Get the element type of an array.
    SgType * SageInterface::getArrayElementType (SgType *t)

Some example code using these functions can be found in https://github.com/rose-compiler/rose-develop/blob/master/src/midend/programTransformation/ompLowering/omp_lowering.cpp

For example, void linearizeArrayAccess(SgPntrArrRefExp* top_array_ref) rewrites array reference using multiple-dimension subscripts to a reference using one-dimension subscripts:
 * a[i][j] is changed to a[i*col_size +j]
 * a [i][j][k] is changed to a [(i*col_size + j)*K_size +k]
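The index arithmetic behind this rewrite can be checked in isolation. The following is a minimal sketch in plain C++ that works on integer subscripts rather than on AST nodes; linearIndex is a hypothetical helper name and is not part of ROSE or of omp_lowering.cpp.

```cpp
#include <cstddef>
#include <vector>

// Compute the row-major linear offset for a multi-dimensional subscript.
// `dims` holds the size of each dimension, outermost first, and
// `subscripts` must have the same length as `dims`.
// The size of the outermost dimension is never needed, as in a[i*col_size + j].
long linearIndex(const std::vector<long> &subscripts, const std::vector<long> &dims) {
    long offset = 0;
    for (std::size_t d = 0; d < subscripts.size(); ++d)
        offset = offset * dims[d] + subscripts[d];
    return offset;
}
```

For int a[5][10][15], the element a[0][1][2] maps to offset (0*10 + 1)*15 + 2 = 17, matching the pattern (i*col_size + j)*K_size + k.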

Sample code to handle 1-D array references

For a 1-D array element access a[0], the AST with 3 nodes looks like:

        a[0]         // node 1: SgPntrArrRefExp
       /    \
      a      0       // node 3: SgIntVal
      // node 2 (a): SgVarRefExp

So the code searching for SgVarRefExp will find a. The next step is to check its type.

    SgVarRefExp *vref = ...;
    ROSE_ASSERT (vref != NULL);

    SgType* t = vref->get_type();

    if (SgArrayType* atype = isSgArrayType(t)) // now you have an array type
    {
        // obtain the dimension vector
        std::vector<SgExpression*> dimensions = SageInterface::get_C_array_dimensions(*atype);
        // dimensions.size() should be 1 if you only handle 1-D array types
        if (dimensions.size() == 1)
        {
            // now you get a[0] from a (NULL if the parent is not an array reference)
            SgPntrArrRefExp* arr_ref_exp = isSgPntrArrRefExp(vref->get_parent());
            // do what you want with a (vref) and a[0] (arr_ref_exp)
        }
    }
    else if (SageInterface::isScalarType(t)) // if scalar types, handle them differently
    {
        ...
    }

How to add new AST nodes?
There is a section named "1.7 Adding New SAGE III IR Nodes (Developers Only)" in ROSE Developer’s Guide (http://www.rosecompiler.org/ROSE_DeveloperInstructions.pdf)

But before you decide to add new nodes, consider whether AstAttribute (user-defined objects attached to the AST) would be sufficient for your problem.

For example, the 1st version of the OpenMP implementation in ROSE (rose/projects/OpenMP_Translator) started by using AstAttribute to represent information parsed from pragmas. Only in the 2nd version did we introduce dedicated AST nodes.

There are two separate steps when new kinds of IR nodes are added into ROSE:
 * First step (declaration): Adding class declaration/implementation into ROSE for the new IR nodes. This step is mostly related to ROSETTA.
 * Second step (creation): Creating those new IR nodes at some point: such as somewhere within frontend, midend, or even backend if desired. So this step is decided case by case.

If the new types of IR come from their counterparts in EDG, then modifications to the EDG/SAGE connection code are needed. If not, the EDG/SAGE connection code may be irrelevant.

If you are trying to add new nodes to represent pragma information, you can create your new nodes without involving EDG or its connection to ROSE. You just parse the pragma string in the original AST and create your own nodes to get a new version of AST. Then it should be done.

How does the AST merge work?
Tests that demonstrate the AST merge are in the directory tests/nonsmoke/functional/CompileTests/mergeAST_tests (run "make check" to see hundreds of tests go by).

parent vs. scope
An AST node can have a parent node which is different from its scope.

For example: the struct declaration's parent is the typedef declaration. But the struct's scope is the scope of the typedef declaration.

typedef struct frame {int x;} s_frame;

Parsing text into AST
There is some experimental support to parse simple code text into AST pieces. It is not intended to parse entire source codes. But the support should be able to be extended to handle more types of input.

Some documentation about this work:
 * http://rosecompiler.org/ROSE_HTML_Reference/namespaceAstFromString.html
 * http://rosecompiler.org/ROSE_Tutorial/ROSE-Tutorial.pdf Chapter 33 Parser Building Blocks

Example project using the parser building blocks
 * projects/pragmaParsing should work.

How to skip system headers in translation?
Often we are only interested in user code. The AST represents all codes from users and system headers. We need to skip things from system headers.

    // Final, most complete version: skip all header files.
    // We cannot unparse a changed AST from header files, at least by default.
    if (Inliner::skipHeaders)
    {
        string filename = funcall->get_file_info()->get_filename();
        string suffix = StringUtility::fileNameSuffix(filename);
        // vector.tcc: this is an internal header file, included by other library headers
        if (suffix=="h" || suffix=="hpp" || suffix=="hh" || suffix=="H" || suffix=="hxx" || suffix=="h++" || suffix=="tcc")
            return false;

        // also check if it is compiler generated, mostly template instantiations. They are not from user code.
        if (funcall->get_file_info()->isCompilerGenerated())
            return false;

        // check if the file is within include-staging/ header directories
        if (insideSystemHeader(funcall))
            return false;
    }
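The suffix test above can be tried outside of ROSE. The sketch below is plain C++ operating on a file name string; fileSuffix and looksLikeHeader are hypothetical helpers, and fileSuffix is only an assumed approximation of what StringUtility::fileNameSuffix returns. Extensionless standard headers such as vector are not caught by a suffix test, which is why the path-based check is also needed.

```cpp
#include <string>

// Return the text after the last '.' in the final path component,
// or an empty string if that component contains no '.'.
std::string fileSuffix(const std::string &path) {
    const std::string::size_type slash = path.find_last_of('/');
    const std::string::size_type dot = path.find_last_of('.');
    if (dot == std::string::npos)
        return "";
    if (slash != std::string::npos && dot < slash)
        return "";  // the only '.' belongs to a directory name
    return path.substr(dot + 1);
}

// Return true if the file name ends in one of the header suffixes
// used by the check in the FAQ text.
bool looksLikeHeader(const std::string &path) {
    static const char *const suffixes[] = {"h", "hpp", "hh", "H", "hxx", "h++", "tcc"};
    const std::string suffix = fileSuffix(path);
    for (const char *candidate : suffixes)
        if (suffix == candidate)
            return true;
    return false;
}
```

With these helpers, looksLikeHeader("main.cxx") is false while looksLikeHeader("all_headers.h") is true.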

    // partial solution
    bool processStatements(SgNode* n)
    {
        ROSE_ASSERT (n != NULL);
        // Skip compiler generated code, system headers, etc.
        if (isSgLocatedNode(n))
        {
            if (isSgLocatedNode(n)->get_file_info()->isCompilerGenerated())
                return false;
        }
        ...
    }

This is based on Sg_File_Info:

    Inside of Sg_File_Info::display(debug.......)
    isTransformation                      = false
    isCompilerGenerated                   = true (no position information)
    isOutputInCodeGeneration              = false
    isShared                              = false
    isFrontendSpecific                    = true (part of ROSE support for gnu compatability)
    isSourcePositionUnavailableInFrontend = false
    isCommentOrDirective                  = false
    isToken                               = false
    file_id  = 2
    filename = /home/liao6/daily-test-rose/upcwork/install/include/gcc_HEADERS/rose_edg_required_macros_and_functions.h
    line     = 167  column   = 1

    ....
    shared[1] int gsj;
    Inside of Sg_File_Info::display(debug.......)
    isTransformation                      = false
    isCompilerGenerated                   = false
    isOutputInCodeGeneration              = false
    isShared                              = false
    isFrontendSpecific                    = false
    isSourcePositionUnavailableInFrontend = false
    isCommentOrDirective                  = false
    isToken                               = false
    filename = /home/liao6/svnrepos/mycode/rose/upc/unshared.upc
    line     = 6  column = 1
    file_id  = 1
    filename = /home/liao6/svnrepos/mycode/rose/upc/unshared.upc
    line     = 6  column   = 1

Another way: ROSE makes a copy of all system headers and stores them in dedicated paths, so a node's file name can be checked against those paths:

    bool insideSystemHeader (SgLocatedNode* node)
    {
        bool rtval = false;
        ROSE_ASSERT (node != NULL);
        Sg_File_Info* finfo = node->get_file_info();
        if (finfo != NULL)
        {
            string fname = finfo->get_filenameString();
            string buildtree_str1 = string("include-staging/gcc_HEADERS");
            string buildtree_str2 = string("include-staging/g++_HEADERS");
            string installtree_str1 = string("include/edg/gcc_HEADERS");
            string installtree_str2 = string("include/edg/g++_HEADERS");
            // if the file name has a system header path of either the build tree or the install tree
            if ((fname.find (buildtree_str1, 0) != string::npos) ||
                (fname.find (buildtree_str2, 0) != string::npos) ||
                (fname.find (installtree_str1, 0) != string::npos) ||
                (fname.find (installtree_str2, 0) != string::npos))
                rtval = true;
        }
        return rtval;
    }

Can ROSE identityTranslator generate 100% identical output file?
https://mailman.nersc.gov/pipermail/rose-public/2011-January/000604.html

Question: The ROSE identityTranslator performs some modifications "automatically". These modifications are:
 * Expanding the assert macro.
 * Adding extra brackets around constants of typedef types (e.g. c=Typedef_Example(12); is translated in the output to c = Typedef_Example((12));)
 * Converting NULL to 0.

Can I avoid these modifications?

Answer: No.

There is currently no easy way to avoid these changes. Some of them are introduced by the C preprocessor (cpp). Others are introduced by the EDG front end that ROSE uses. A 100% faithful source-to-source translation may require significant changes to the handling of preprocessing directives and to the EDG internals.

We have had some internal discussions about saving raw token strings in the AST and using them to unparse code faithfully, but this effort is still at an initial stage as far as I know.

How to build a tool inserting function calls?
https://mailman.nersc.gov/pipermail/rose-public/2010-July/000319.html

Question: I am trying to build a tool which inserts one or more function calls whenever the source code contains a function belonging to a certain group (e.g. all functions beginning with foo_*). During the AST traversal, how can I find the right place? That is, is there a function in ROSE that searches for a string pattern or something similar?

Answers:
 * In Chapter 28 (AST Construction) of the ROSE tutorial, there are examples that instrument function calls into the AST using traversals or a queryTree. I would approach this by checking each node for the specific SgFunctionDefinition (or whatever you need) and then checking the name of the node to find its location.
 * Alternatively, you can:
   * use the AST query mechanism to find all functions and store them in a container, e.g. Rose_STL_Container<SgNode*> nodeList = NodeQuery::querySubTree(root_node,V_Sg????);
   * then iterate over the container and check whether each function's name matches what you want.
   * use SageBuilder namespace's buildFunctionCallStmt to create a function call statement.
   * use SageInterface namespace's insertStatement to do the insertion.
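The name check in the second step can be sketched independently of ROSE. This is a minimal illustration, not ROSE API: hasPrefix and selectByPrefix are hypothetical helpers, and plain strings stand in for the names you would read from each function declaration found by the query (for instance via get_name().getString(), an accessor you should verify against your ROSE version).

```cpp
#include <string>
#include <vector>

// True if name begins with prefix (e.g. prefix "foo_" matches "foo_init").
bool hasPrefix(const std::string& name, const std::string& prefix) {
    return name.compare(0, prefix.size(), prefix) == 0;
}

// Keep only the names that start with prefix, preserving their order.
std::vector<std::string> selectByPrefix(const std::vector<std::string>& names,
                                        const std::string& prefix) {
    std::vector<std::string> selected;
    for (const std::string& name : names) {
        if (hasPrefix(name, prefix)) selected.push_back(name);
    }
    return selected;
}
```

A plain prefix test is enough for a group such as foo_*; a more general pattern would need something like std::regex instead.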

How to insert a header into an input file?
There is a SageInterface function for doing this:
// Insert #include "filename" or #include <filename> (system header) into the global scope containing the current scope, right after other #include XXX.
PreprocessingInfo* SageInterface::insertHeader (const std::string &filename, PreprocessingInfo::RelativePositionType position=PreprocessingInfo::after, bool isSystemHeader=false, SgScopeStatement *scope=NULL)

How to copy/clone a function?
https://mailman.nersc.gov/pipermail/rose-public/2011-April/000919.html

We need to be more specific about the function you want to copy. Is it just a prototype function declaration (a non-defining declaration in ROSE's terms) or a function with a definition (a defining declaration in ROSE's terms)?
 * Copying a non-defining function declaration can be achieved by using the following function:

// Build a prototype for an existing function declaration (defining or nondefining is fine).
SgFunctionDeclaration* SageBuilder::buildNondefiningFunctionDeclaration (const SgFunctionDeclaration *funcdecl, SgScopeStatement *scope=NULL)

 * Copying a defining function declaration is semantically a problem since it introduces a redefinition of the same function. It is a hack at best to first introduce something wrong and later correct it. Here is an example translator to do the hack (copy a defining function, rename it, fix its symbol):

#include <rose.h>
using namespace SageInterface;

int main(int argc, char** argv)
{
  SgProject* project = frontend(argc, argv);
  AstTests::runAllTests(project);

  // Find a defining function named "bar" under project
  SgFunctionDeclaration* func = findDeclarationStatement<SgFunctionDeclaration> (project, "bar", NULL, true);
  ROSE_ASSERT (func != NULL);

  // Make a copy and set it to a new name
  SgFunctionDeclaration* func_copy = isSgFunctionDeclaration(copyStatement (func));
  func_copy->set_name("bar_copy");

  // Insert it to a scope
  SgGlobal* glb = getFirstGlobalScope(project);
  appendStatement (func_copy, glb);

#if 0 // fix up the missing symbol; this should be optional now since SageInterface::appendStatement should handle it transparently.
  SgFunctionSymbol* func_symbol = glb->lookup_function_symbol ("bar_copy", func_copy->get_type());
  if (func_symbol == NULL)
  {
    func_symbol = new SgFunctionSymbol (func_copy);
    glb->insert_symbol("bar_copy", func_symbol);
  }
#endif

  AstTests::runAllTests(project);
  backend(project);
  return 0;
}


 * Another thing to consider is whether you want to copy the function into another file. If so, you have to change the clone's file location information.
 * Original post: https://mailman.nersc.gov/pipermail/rose-public/2013-April/002173.html

ROSE's unparser checks the Sg_File_Info objects of AST pieces before it decides whether to print them out as text. By default, only AST pieces that come from the input file itself, or that were generated by a transformation, are unparsed. For example, some AST subtrees come from an included header, and it is usually not desirable to unparse the content of an included header.

If the file info is still the original file info, the solution is to set the copied AST to be transformation-generated:

// Recursively set source position info (Sg_File_Info) as transformation generated.
SageInterface::setSourcePositionForTransformation (SgNode *root)

Can I transform code within a header file?
https://mailman.nersc.gov/pipermail/rose-public/2011-May/000971.html

No. ROSE does not unparse AST from headers right now. A summer project tried to add this, but the work was not finished and is not well tested.

The experimental options are -rose:unparseHeaderFiles -rose:unparseHeaderFilesRootFolder UNPARSED_HEADERS_DIR; they are exercised in tests/CompilerTests/UnparseHeadersTests.

https://mailman.nersc.gov/pipermail/rose-public/2010-August/000344.html

I guess ROSE does not support writing out changed headers for safety and practical reasons. A changed header has to be saved to another file, since writing to the original header is very dangerous (imagine debugging a header translator which corrupts its input headers). Then all other files and headers using the changed header have to be updated to use the new header file.

Also, all files involved have to be writable by the user's translator.

As a result, the current unparser skips subtrees of AST from headers by checking file flags (compiler_generated and/or output_in_code_generation etc.) stored in Sg_File_Info objects.

How to work with formal and actual arguments of functions?
https://mailman.nersc.gov/pipermail/rose-public/2011-June/001008.html

// Get the actual arguments
SgExprListExp* actualArguments = NULL;
if (isSgFunctionCallExp(callSite))
  actualArguments = isSgFunctionCallExp(callSite)->get_args();
else if (isSgConstructorInitializer(callSite))
  actualArguments = isSgConstructorInitializer(callSite)->get_args();
ROSE_ASSERT(actualArguments != NULL);

const SgExpressionPtrList& actualArgList = actualArguments->get_expressions();

// Get the formal arguments.
SgInitializedNamePtrList formalArgList;
if (calleeDef != NULL)
  formalArgList = calleeDef->get_declaration()->get_args();

// The number of actual arguments can be less than the number of formal arguments (with implicit arguments)
// or greater than the number of formal arguments (with varargs)
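Because the two lists can differ in length, code that walks them together should stop at the shorter one. The sketch below shows one way to do that with standard containers only. pairArguments is a hypothetical helper, not a ROSE function, and it assumes the argument lists behave like std::vector (in a translator the element types would be SgExpression* and SgInitializedName*).

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Pair the i-th actual with the i-th formal for every index present in both
// lists; surplus entries on either side are left unpaired.
template <typename Actual, typename Formal>
std::vector<std::pair<Actual, Formal>> pairArguments(
    const std::vector<Actual>& actuals, const std::vector<Formal>& formals) {
    const std::size_t n = std::min(actuals.size(), formals.size());
    std::vector<std::pair<Actual, Formal>> pairs;
    pairs.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        pairs.emplace_back(actuals[i], formals[i]);
    }
    return pairs;
}
```

Leftover actuals (varargs) or leftover formals (implicit arguments) then need separate handling, depending on what the analysis requires.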

How to translate multiple files scattered in different directories of a project?
Question:

On 07/25/2012 11:20 AM, Fernando Rannou wrote:
> Hello
>
> We are trying to use ROSE to refactor a big project consisting of several *.cc and *.hh files, located at various directories. Each class is defined in a *.hh file and implemented in a *.cc file. Classes include (#include) other class definitions. But we have only found single file examples.
>
> Is this possible? If so, how?
>
> Thanks

Answer: Expected behavior of a ROSE translator:

A translator built using ROSE is designed to act like a compiler (gcc, g++, gfortran, etc., depending on the input file types). So users of the translator only need to change the build system of the input files to use the translator instead of the original compiler.

If your original compiler implicitly includes or links anything, you may have to make the include or linking paths explicit after the change. For example, if mpiCC transparently links to /path/to/mpilib.a, you have to add this linking flag to your modified Makefile.

Generate code into different files
https://mailman.nersc.gov/pipermail/rose-public/2012-August/001742.html

Question: Is it possible for ROSE to generate two files (.c and .cl) when it translates C to OpenCL?

Answer: The ROSE outliner has an option to output the generated function into a new file.

https://github.com/rose-compiler/rose/blob/master/src/midend/programTransformation/astOutlining/Outliner.hh

...
// Generate the outlined function into a separated new source file
// -rose:outline:new_file
extern bool useNewFile;
...

You may want to check how this option is used in the outliner source files to get what you want.

How is the binary analysis capability in ROSE?
Question: How is the binary analysis capability in ROSE? Is it just disassembly? Is it possible to associate the binary code with the source if combined with ROSE source code analysis?

Answer:

ROSE has various binary disassemblers (x86, ARM, MIPS, PowerPC) that, like source code analysis, create an internal representation of the binary in the form of an AST. Although the types of AST nodes for source and binaries are largely disjoint, one can analyze the binary AST using concepts similar to source analysis. ROSE has a few binary analyses. Here are some off the top of my head:
 * Control flow graphs, both virtual and using Boost Graph Library.
 * Function call graphs.
 * Operations on control flow graphs: dominator, post-dominator
 * Pointer detection analysis that tries to figure out which memory locations are used as if they were pointers in a higher level language.
 * Instruction partitioning: figuring out how to group instructions into basic blocks, and how to group basic blocks into functions when all you have is a list of instructions. Its accuracy on automatically partitioning stripped, obfuscated code has been shown to be better than the best disassemblers that use debugging info and symbol tables.
 * Instruction semantics for x86. This is an area of active development but supports only 32-bit integer instructions. We plan to add floating point, SIMD, 64-bit, other architectures, and a simpler API. But even as it stands, it is complete enough to simulate entire ELF executables (even "vi"). See the next bullet.
 * An x86 simulator for ELF executables. This project is able to simulate how the Linux kernel loads an executable, and the various system calls made by the executable. It is complete enough to simulate many Linux programs, but also provides callback points for the user to insert various kinds of analyses. For instance, you could use it to disassemble an entire process after it has been dynamically linked. There are many examples in the projects/simulator directory. In contrast to simulators like Qemu, Bochs, valgrind, VirtualBox, VMware, etc., where speed is a primary design driver, the ROSE simulator is designed to provide user-level access to as many aspects of execution as possible.
 * Plugins for instruction semantics. Instruction semantics is written in such a way that different "semantic domains" can be plugged in. ROSE has a symbolic domain, an interval domain, and a partial-symbolic domain.  The symbolic domain can be used in conjunction with an SMT solver (currently supporting Yices). The interval domain is actually sets of intervals, and is binary-arithmetic-aware (i.e., correctly handles overflows, etc on a fixed word size).  The partial-symbolic domain uses single-node expressions in order to optimize for speed and size at the expense of accuracy.  Users can and have written other domains, and a new API (in the works) will make this even  easier.
 * Examples of data-flow analysis (e.g., the pointer analysis already mentioned), but not a well defined framework yet (someone is working on one). Currently, data-flow type analyses are implemented using the instruction semantics support: as each instruction is "executed" the domain in which it executes causes the data to flow in the machine state. Each analysis provides its own flow equation to handle the points where control flow joins from two or more directions; and provides its own "next-instruction" function to iterate over the control flow graph.
 * Clone detection of various formats: various forms of syntactic, including one using locality-sensitive hashing; and semantic clone detection via fuzz testing in a simulator.

By Robb

You ask about combining source and binary analysis... It's certainly possible since ROSE can hold both the binary and source ASTs in memory at the same time. But I'm not aware of any analysis that "sews" them together. We do support parsing DWARF info from ELF executables, so you might be able to use that to sew the two ASTs together.

--Robb

git clone returns error: SSL certificate problem?
Symptom:
git clone https://github.com/rose-compiler/rose.git
Cloning into rose...
error: SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed while accessing https://github.com/rose-compiler/rose.git/info/refs

fatal: HTTP request failed

The reason may be that you are behind a firewall which alters the original SSL certificate.

Solutions: Tell cURL to not check for SSL certificates:

# Solution 1: environment variable (temporary)
$ env GIT_SSL_NO_VERIFY=true git pull

# Solution 2: git-config (permanent)
# set local configuration
$ git config --local http.sslVerify false

# or set global configuration
$ git config --global http.sslVerify false

What is the best IDE for ROSE developers?
https://mailman.nersc.gov/pipermail/rose-public/2010-April/000115.html

There may not be a widely recognized best integrated development environment. But developers have reported that they are using
 * vim
 * emacs
 * KDevelop
 * Source Navigator
 * Eclipse
 * Netbeans

The thing is that ROSE is huge and has some ridiculously large generated source files (CxxGrammar.h and CxxGrammar.C are generated in the build tree, for example). So many code browsers may have trouble handling ROSE.

What is the status for supporting Windows?
We do maintain some preliminary Windows support for building ROSE/src to generate librose.so by leveraging CMake. However, the work is not finished.

To build librose under Windows, type the following commands in the top-level source tree:
mkdir ROSE-build-cmake
cd ROSE-build-cmake
cmake .. -DBOOST_ROOT=${ROSE_TEST_BOOST_PATH}

Here ROSE_TEST_BOOST_PATH is the Boost installation path, for example /opt/boost_1_40_0-inst.

https://mailman.nersc.gov/pipermail/rose-public/2011-December/001349.html

We have not finished the Windows work yet. It is on our list of things to do. It was started, and ROSE internally compiles using MS Visual Studio (using project files generated from the CMake build that we maintain and test within our release process for ROSE), but it does not pass our tests. So it is not ready. The distribution of the EDG binaries for Windows is another step that would come after that. We don't know at present when this will be done; it is not a high priority for our DOE-specific work, but it is important for other work. The effort required is something that we could discuss. If you want to call me, that would be the best way to proceed. Send me email off of the main list and we can set that up.

https://mailman.nersc.gov/pipermail/rose-public/2011-March/000798.html

Under Windows, ROSE uses CMake. This is a project that is currently under development. As of November 2010 we are able to compile and link the src directory. We are also able to run example programs that link against librose and execute the frontend and backend. However, this is an internal capability and not available externally yet, since we don't distribute the Windows-generated EDG binaries that would be required. Also, the current support for Windows is still incomplete: ROSE does not yet pass its internal tests under Windows.