Sparse Semantic Parser
[kernel
linux
c
static-analysis
software-development
]
Sparse is a C language “semantic parser” originally written by Linus Torvalds to support his work on the Linux kernel. It is used to detect bugs through static analysis.
It was designed, according to the README file, to be “small - and simple” and particularly to be “easy to use”. Reasons to use a simple C parser could include data mining (to summarize particular features of some code, for example), analysis (possibly to look for troublesome patterns), or visualization (to make it easier to understand or navigate around a large code base). In support of this reuse, sparse is licensed under the permissive MIT License and is structured as a library that other tools can easily incorporate. This library is accompanied by a number of tools that demonstrate some of those reuse possibilities.
Sparse parses C code into an abstract syntax tree (AST) and provides functionality to simplify that AST so that particular features of the code can stand out, but it keeps the focus fairly local. In particular, it doesn’t support any significant data-flow analysis to detect how values change across a sequence of code.
Install it
If you have a Debian-like distro, you can install it simply with:
$ sudo apt-get install sparse
If your distribution does not provide a precompiled package for sparse, you can build and install it manually from the git repository:
$ git clone git://git.kernel.org/pub/scm/devel/sparse/sparse.git
$ cd sparse
$ make
$ make install
Getting started
To run sparse checks on the kernel, pass the C variable to make when building:
# To run sparse on files that get recompiled
$ make C=1
# To run sparse on all files whether they need to be recompiled or not
$ make C=2
# To save the warnings to a file
$ make C=2 2>outfile
# To check only the files under a given directory
$ make C=2 drivers/staging/
Output
Sparse reports various errors and warnings while examining the code. A large class of the warnings and errors that Sparse generates comes from extending the C language in various ways. In fact, Sparse defines the macro __CHECKER__ so that the use of these extensions can be made visible only to sparse, not to other C compilers.
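As a minimal sketch of how that macro is typically used, the fragment below defines two annotation macros that expand to sparse attributes only when __CHECKER__ is defined and to nothing otherwise. It is modelled on the way the kernel gates its own annotations, but the names demo_user, demo_force and demo_copy_in are hypothetical, and the address-space number 1 is an arbitrary choice for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef __CHECKER__
/* Seen only by sparse: the pointer lives in a separate address space
 * and must not be dereferenced directly. */
# define demo_user  __attribute__((noderef, address_space(1)))
/* Seen only by sparse: marks a cast as deliberate, silencing the warning. */
# define demo_force __attribute__((force))
#else
/* Ordinary compilers see plain C with no annotations at all. */
# define demo_user
# define demo_force
#endif

/* Hypothetical helper: copies len bytes from an annotated source pointer. */
size_t demo_copy_in(void *dst, const void demo_user *src, size_t len)
{
    assert(dst != NULL);
    /* The force cast tells sparse the address-space change is intentional. */
    memcpy(dst, (const void demo_force *)src, len);
    return len;
}
```

With a regular compiler both macros vanish and this is ordinary C. Under sparse, handing a demo_user pointer to a function that expects a plain pointer without the force cast should produce an address-space warning.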
Here is an example of Sparse output:
drivers/gpu/drm/i915/i915_drm_client.c:92:9: warning: incorrect type in argument 1 (different address spaces)
drivers/gpu/drm/i915/i915_drm_client.c:92:9: expected struct list_head const *list
drivers/gpu/drm/i915/i915_drm_client.c:92:9: got struct list_head [noderef] __rcu *pos
drivers/gpu/drm/i915/i915_drm_client.c:92:9: error: incompatible types in comparison expression (different address spaces):
drivers/gpu/drm/i915/i915_drm_client.c:92:9: struct list_head [noderef] __rcu *
drivers/gpu/drm/i915/i915_drm_client.c:92:9: struct list_head *
drivers/gpu/drm/i915/i915_drm_client.c:92:9: error: incompatible types in comparison expression (different address spaces):
drivers/gpu/drm/i915/i915_drm_client.c:92:9: struct list_head [noderef] __rcu *
drivers/gpu/drm/i915/i915_drm_client.c:92:9: struct list_head *
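To make the first warning above more concrete, here is a small sketch of the kind of code that is expected to draw a “different address spaces” complaint from sparse. It is not the i915 code from the output; struct node, node_is_last, check_annotated and the demo_rcu macro are hypothetical names, and the address-space number 4 is arbitrary.

```c
#include <stddef.h>

#ifdef __CHECKER__
/* Only sparse sees the annotation; the number is an arbitrary label. */
# define demo_rcu __attribute__((noderef, address_space(4)))
#else
# define demo_rcu
#endif

struct node {
    struct node *next;
    int value;
};

/* Takes an ordinary, unannotated pointer. */
int node_is_last(const struct node *n)
{
    return n->next == NULL;
}

int check_annotated(struct node demo_rcu *pos)
{
    /* sparse is expected to flag this call: the argument is in a
     * different address space than the parameter it is passed to.
     * A regular C compiler accepts it without comment. */
    return node_is_last(pos);
}
```

The program behaves the same whether or not sparse is run. The annotation only gives the checker enough information to notice that an annotated pointer is being used where a plain one is expected.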
Some of these extensions, like __range__, are new statements, but most are attributes that can be attached to variable and type declarations using the GCC attribute syntax extension. These can provide extra information about how a variable or type should be used so sparse can warn when the expectations are not met.
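One attribute of this kind is bitwise, which marks an integer type as one that should not be mixed with plain integers. The sketch below uses it for a big-endian 32-bit type, similar in spirit to the kernel's endianness annotations. The names demo_bitwise, demo_force, demo_be32 and the two conversion helpers are hypothetical and only illustrate the idea.

```c
#include <stdint.h>
#include <string.h>

#ifdef __CHECKER__
# define demo_bitwise __attribute__((bitwise))
# define demo_force   __attribute__((force))
#else
# define demo_bitwise
# define demo_force
#endif

/* A 32-bit value held in big-endian byte order. Under sparse it is a
 * distinct type that cannot be silently mixed with uint32_t. */
typedef uint32_t demo_bitwise demo_be32;

/* Convert a host-order value to its big-endian representation. */
demo_be32 demo_cpu_to_be32(uint32_t v)
{
    unsigned char b[4] = {
        (unsigned char)(v >> 24), (unsigned char)(v >> 16),
        (unsigned char)(v >> 8),  (unsigned char)v
    };
    uint32_t raw;
    memcpy(&raw, b, sizeof raw);
    /* The force cast marks the one place where crossing types is intended. */
    return (demo_force demo_be32)raw;
}

/* Convert a big-endian value back to host order. */
uint32_t demo_be32_to_cpu(demo_be32 x)
{
    uint32_t raw = (demo_force uint32_t)x;
    unsigned char b[4];
    memcpy(b, &raw, sizeof raw);
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}
```

With this in place, assigning a demo_be32 to a uint32_t without going through demo_be32_to_cpu should make sparse warn, while an ordinary compiler sees only uint32_t throughout.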
For a deeper dive into Sparse, please refer to this LWN.net article.
The list of warnings and errors Sparse can raise is available here.