Any C/C++ refactoring tool based on libclang? (even the simplest "toy example")
Do you know of any C/C++ refactoring tool based on libclang?
"Any" includes even a simple alpha-stage project that supports a single refactoring technique. It can lack preprocessor support. As an example of the functionality I mean: changing method names, whether across multiple files or only one file at a time. You might be wondering what the goal is of asking for even small working examples. My thought is that collecting code examples and small tools in one place will provide a better resource for learning how to implement refactoring with libclang. I believe that bigger projects might grow from simple ones, in a proper open-source manner :).
Clang contains a library called "CIndex" which was developed, I believe, for doing code completion in IDEs. It can also be used for parsing C++ and walking the AST, but doesn't have anything in the way of refactoring. See Eli Bendersky's article here.
I have started such a project recently: cmonster. It's a Python-based API for parsing C++ (using libclang), analyzing the AST, with an interface for "rewriting" (i.e. inserting/removing/modifying source ranges). There's no nice way (yet) for doing things like modifying function names and having that translated into source-modifications, but it wouldn't be terribly difficult to do that.
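To make the idea of "rewriting" source ranges concrete, here is a minimal sketch in plain C++ with no libclang dependency. It is not cmonster's API: the `Edit` struct and `applyEdits` function are hypothetical names invented for this illustration, and the byte offsets are assumed to come from whatever AST walk found the ranges. The one design point it shows is that edits are applied from the end of the buffer towards the start, so offsets computed against the original text stay valid.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical edit record: replace `length` bytes starting at `offset`
// (measured in the ORIGINAL text) with `text`.
// length == 0 is an insertion; an empty `text` is a removal.
struct Edit {
    std::size_t offset;
    std::size_t length;
    std::string text;
};

// Applies edits to a copy of `source` and returns the result.
// Assumption: the edits do not overlap; this sketch does not check for that.
std::string applyEdits(std::string source, std::vector<Edit> edits) {
    // Highest offset first, so an edit never shifts the position of
    // an edit that is still waiting to be applied.
    std::sort(edits.begin(), edits.end(),
              [](const Edit& a, const Edit& b) { return a.offset > b.offset; });
    for (const Edit& e : edits) {
        assert(e.offset + e.length <= source.size());
        source.replace(e.offset, e.length, e.text);
    }
    return source;
}
```

For example, renaming `foo` in `int foo(); int x = foo();` means passing two edits, `{4, 3, "bar2"}` and `{19, 3, "bar2"}`, one for the declaration and one for the call. A real rename would obtain those offsets from the AST rather than hard-coding them.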
I have not yet created a release with this functionality (although it's in the github repo), as I'm waiting for llvm/clang 3.0 to be released.
Also, I should point out a couple of things:
- The code is very rough; calling it alpha would perhaps be generous.
- I'm by no means an expert on this subject (unlike, say, Dr. Ira Baxter over there).
Adjust expectations appropriately.
Update: cmonster 0.2 has been released, which includes the described features. Check it out on Github.
Google have been working on a tooling library for Clang. It has been included since the 3.2 release. It includes an ASTMatchers library, so you can just build up a query and don't have to walk the AST.
There is a great video talk on the subject that walks through a simple rename example. (This is from the same guy as the MapReduce talk posted above but is newer and more about a simple practical implementation rather than the internal design and enterprise scale stuff Google have going on).
The source for that example, which renames a method, is available in the tooling branch. It may be somewhere in the trunk, but I can't find it. Also, rename the getDeclAs function to getNodeAs, as the former is apparently deprecated. There is a more advanced example that removes redundant c_str calls (which is in trunk and which someone posted above).
EDIT: Google are now working on something called Clangd which aims to be some kind of Clang server for refactoring.
Google made a Clang-based refactoring tool for their C++ code base and plan to release it. I don't know the current state of the project, but you can see this demo presented at the 2011 LLVM Developers Meeting: https://www.youtube.com/watch?v=mVbDzTM21BQ.
Also, Xcode's (4+) built-in auto-completion and refactoring functions are based on libclang.
This may be a bit 'meta', but there's an example that's written with Clang as a tool to run on Clang (although there's more to it than just that).
// This file implements a tool that prints replacements that remove redundant
// calls of c_str() on strings.
//
// Usage:
// remove-cstr-calls <cmake-output-dir> <file1> <file2> ...
//
// Where <cmake-output-dir> is a CMake build directory in which a file named
// compile_commands.json exists (enable -DCMAKE_EXPORT_COMPILE_COMMANDS in
// CMake to get this output).
//
// <file1> ... specify the paths of files in the CMake source tree. This path
// is looked up in the compile command database. If the path of a file is
// absolute, it needs to point into CMake's source tree. If the path is
// relative, the current working directory needs to be in the CMake source
// tree and the file must be in a subdirectory of the current working
// directory. "./" prefixes in the relative files will be automatically
// removed, but the rest of a relative path must be a suffix of a path in
// the compile command line database.
//
// For example, to use remove-cstr-calls on all files in a subtree of the
// source tree, use:
//
//   /path/in/subtree $ find . -name '*.cpp'|
//       xargs remove-cstr-calls /path/to/source
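The path rule in that header comment (strip leading "./", then require the rest to be a suffix of a path in the compile command database) can be sketched as follows. This is only an illustration of the rule as the comment words it, not the code Clang actually uses; `stripDotSlash` and `matchesDatabaseEntry` are hypothetical names, and requiring the suffix to start at a path component boundary is my own assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Removes every leading "./" from a relative path, e.g. "././a.cpp" -> "a.cpp".
std::string stripDotSlash(std::string path) {
    while (path.compare(0, 2, "./") == 0) {
        path.erase(0, 2);
    }
    assert(path.compare(0, 2, "./") != 0);  // postcondition of the loop
    return path;
}

// Returns true if `relative`, after stripping "./" prefixes, is a suffix of
// `dbPath` (a path as stored in the compile command database).
// Assumption: the suffix must begin at a '/' boundary, so "ib/a.cpp" does
// not match ".../lib/a.cpp".
bool matchesDatabaseEntry(const std::string& dbPath, const std::string& relative) {
    const std::string rel = stripDotSlash(relative);
    if (rel.empty() || rel.size() > dbPath.size()) {
        return false;
    }
    const std::size_t start = dbPath.size() - rel.size();
    if (dbPath.compare(start, rel.size(), rel) != 0) {
        return false;
    }
    return start == 0 || dbPath[start - 1] == '/';
}
```

With this sketch, `./lib/a.cpp` matches a database entry `/src/proj/lib/a.cpp`, while `b/a.cpp` does not.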
https://github.com/lukhnos/refactorial is based on Clang and claims to offer:
- Accessor: Synthesize getters and setters for designated member variables
- MethodMove: Move inlined member function bodies to the implementation file
- ExtractParameter: Promote a function variable to a parameter to that function
- TypeRename: Rename types, including tag types (enum, struct, union, class), template classes, Objective-C types (class and protocol), typedefs and even built-in types (e.g. unsigned to uint32_t)
- RecordFieldRename: Rename record (struct, union) fields, including C++ member variables
- FunctionRename: Rename functions, including C++ member functions
It works via specifications in a YAML configuration file. I haven't tried it out (yet).
Not open source, but has been used to carry out very non-toy massive automated refactoring of C++ programs: our DMS Software Reengineering Toolkit. DMS is a "library" (we call it a "toolkit") of facilities one can compose to achieve analysis and/or automated translation.
Relevant to C++, DMS provides at this point in time:
- Full C++11 parser, constructing the AST and able to regenerate source code accurately including comments, with a complete preprocessor
- Full C++ parser with name and type resolution for C++ (ANSI, GNU, MS Visual C++)
- Control flow analysis for C++
- Source-to-source transformations
- Partially complete "rename" machinery (see discussion below)
What I can say from experience is that C++ is a bitch of a language to transform.
We continue to work on it, and are completing a reliable renaming tool. Even this is hard; a key problem is the name-shadowing problem. You have a local variable X, and a reference to Y inside that scope; you attempt to rename Y to X and discover that the local variable "captures" the access. It is amazing how many namespaces and capture types you have to worry about in C++. And this is needed as a foundation for many other refactorings.
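The capture problem can be shown with a deliberately tiny model. The sketch below is my own simplification, not how DMS (or any real tool) does it: scopes form a chain through a parent index, each scope only records the names declared directly in it, and `Scope` and `renameWouldBeCaptured` are hypothetical names. It ignores namespaces, class members, overloads, using-declarations, templates and everything else that makes the real problem hard.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy scope: index of the enclosing scope (-1 for the outermost one)
// plus the names declared directly inside it.
struct Scope {
    int parent;
    std::set<std::string> names;
};

// A declaration lives in `declScope` and is referenced from `refScope`.
// Returns true if renaming the declaration to `newName` would make that
// reference resolve to some other declaration (or clash in declScope itself).
// Assumption: declScope is refScope or one of its enclosing scopes.
bool renameWouldBeCaptured(const std::vector<Scope>& scopes, int declScope,
                           int refScope, const std::string& newName) {
    assert(refScope >= 0 && static_cast<std::size_t>(refScope) < scopes.size());
    // Walk outwards from the reference, the way unqualified lookup would.
    for (int s = refScope; s != -1; s = scopes[static_cast<std::size_t>(s)].parent) {
        const Scope& scope = scopes[static_cast<std::size_t>(s)];
        if (scope.names.count(newName) != 0) {
            return true;  // an existing newName is found first: capture or clash
        }
        if (s == declScope) {
            return false;  // reached the declaration without meeting newName
        }
    }
    return false;  // declScope was not on the chain; outside this toy model
}
```

Take a global scope declaring `Y`, a function scope declaring local `X`, and a nested block that references `Y`. Asking whether `Y` can be renamed to `X` returns true for that reference, because lookup from the block meets the local `X` before it reaches the global scope. Renaming to an unused name such as `Z` returns false.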
EDIT Feb 2014: Full C++14 parser, control flow analysis, local data flow analysis
Another possibility is to develop your own plugin for GCC, or to develop a GCC MELT extension to do your task. But extending GCC (or Clang) requires understanding the internal representations of these compilers (Gimple & Tree for GCC), and this requires some work. MELT is a high-level domain-specific language for extending GCC.
It's not refactoring, but completion, but might be useful: