Last modified: 01 Sep 2006 | Contact: Stefan Hundhammer
KDirStat is a graphical disk usage utility, very much like the Unix "du" command. In addition to that, it comes with some cleanup facilities to reclaim disk space.
While KDirStat is a KDE program, it runs fine on every X11 desktop, i.e., it runs on Linux, BSD, and lots of other Unix-type systems (Solaris, HP-UX, AIX, ...).
MS Windows users, please note that there are operating systems and window systems beyond those from Redmond, WA. This may come as a surprise to some people. ;-) There is an MS Windows clone called WinDirStat. Yes, that one is the clone. KDirStat is the original.
Screenshots: KDirStat main window (~25 k each)
See the features section for more info.
2006-08-29: Bogus News Story about KDirStat in the Chicago Tribune
There was an article in the Chicago Tribune about Open Source software in general and a program they found useful in particular: WinDirStat. Granted, their advocating Open Source is commendable - even though the author of that article did not seem to have read the available documentation about treemaps and what they are for: He writes about "Tetris-style jewel colors" without bothering to explain what that thingy is good for.
Whatever: They do introduce WinDirStat and explain its usefulness. And they even include a download hyperlink. But that hyperlink is plain wrong. They mean to hyperlink to WinDirStat, but that address would be http://windirstat.info/, not what they wrote: http://kdirstat.sourceforge.net/. Yes, you read right: They hyperlinked to the KDE/Linux/BSD/Unix* version, not to the MS Windows version.
Of course, a number of people fell for that trap, wondering how to start the kdirstat.tar.bz2 file they downloaded on their MS Windows machines. A few of them even contacted me for help.
Obviously nobody at the Chicago Tribune bothered to click on their own link to find where it would lead. They would have noticed instantly.
MS Windows Users: You want WinDirStat, not KDirStat. KDirStat runs on KDE running on top of Linux, BSD and other Unix-type operating systems. Even though the mere thought might be alien to many MS Windows users: There are operating systems and window systems beyond those from MS. Not every program with a nice GUI runs on top of MS Windows. ;-)
And just to reiterate: KDirStat is the original. WinDirStat was a port to MS Windows. Yes, it can be that way, too. ;-)
2006-06-01 New Development Version: kdirstat-2.5.3.tar.bz2 (Last stable release is 2.4.4)
This is a bug-fix release to 2.5.2. Those of you who used that new cache file feature will have noticed that reading the cache files tended to crash frequently. This is the main fix for this release.
Sorry it took so long to release this version. If you look into the change log inside the package, you will find that the fixes went to CVS on 6th February. But I never got around to making packages and announcing the release here and on Freshmeat and apps.kde.org because at about that time all hell broke loose here at SuSE with the libzypp project. With all the overtime we worked during that period I didn't feel very much like spending time on weekends working on KDirStat.
But development will continue. Promised.
2006-01-08 New Development Version: kdirstat-2.5.2.tar.bz2 (Last stable release is 2.4.4)
KDirStat can now read and write cache files, i.e., files that contain the disk information KDirStat displays. You can use a Perl script that comes with KDirStat to generate cache files overnight in cron jobs and display the content of a very large directory tree in a couple of seconds.
To give some rough numbers, on my laptop it takes KDirStat about 3 minutes to scan /usr. Reading the same information from a cache file takes 3-5 seconds.
On the downside, the disk content may have changed in the meantime. A cache file is outdated by definition. But it may still give you some rough ideas. And there are large directory trees that hardly ever change.
Or you may be a system administrator with an NFS server that houses home directories, and every now and then you have to check exactly which of your users has once again managed to fill up that file system to 95%. One thing you cannot do (or your users will hate you for it) is start KDirStat during working hours to scan all those home directories. So do that with the kdirstat-cache-writer Perl script in a cron job running in the middle of the night and view the result with KDirStat during your normal office hours.
As a matter of fact, one of our system administrators at SuSE requested this KDirStat feature for that very reason. So here it is.
This is still in development. It is currently integrated into the KDirStat user interface only very crudely: There are two entries in the "File" menu, "Write to Cache File..." and "Read Cache File...".
There is currently no indication that cached values are displayed. This will have to change.
When you click the "Reload" button, the directory tree is really scanned (just in case you might have thought the cache file is read again). This is intentional.
Tip: If you generate cache files with "kdirstat-cache-writer -l", they will become somewhat (~20%) larger, but you can also use them as a replacement for "locate". Simply use "zgrep" in such a file and ignore the size, mtime etc. fields.
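The tip above boils down to a pattern search over a gzip-compressed text file. Here is a minimal sketch of the same idea in Python, assuming only that the cache file holds one entry per line. The exact field layout is not described here, so the sketch returns whole matching lines and leaves it to you to ignore the size and mtime fields. The helper name search_cache is made up for this example; it is not part of KDirStat.

```python
import gzip
import re


def search_cache(cache_path, pattern):
    """Return all lines of a gzip-compressed cache file that match pattern.

    Rough equivalent of "zgrep pattern cache_path". No assumption is made
    about the fields on each line; whole lines are returned.
    """
    regex = re.compile(pattern)
    matches = []
    # Mode "rt" decompresses on the fly and yields text lines.
    with gzip.open(cache_path, "rt", encoding="utf-8", errors="replace") as cache:
        for line in cache:
            if regex.search(line):
                matches.append(line.rstrip("\n"))
    return matches
```

Calling search_cache with the path of a cache file and a regular expression returns the matching entries as a list, which is handy if you want to post-process the hits in a script rather than read them on the terminal.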
I want to make KDirStat read cache files with a default file name (currently ".kdirstat.cache.gz") automatically when they are found while reading a directory tree and the cache file belongs to that directory tree.
For example, during scanning /home/kilroy, if /home/kilroy/projects/hugeproj is read and it contains a file .kdirstat.cache.gz with the content of /home/kilroy/projects/hugeproj, the content of that cache file is used rather than further scanning everything below /home/kilroy/projects/hugeproj .
That way, you can simply throw a cache file into kdirstat's way for large directories that hardly ever change - or for large directories you don't care too much about anyway.
Of course, there will be a setup option to switch that behaviour off.
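To illustrate the planned behaviour, here is a rough sketch in Python (not KDirStat code, and not a description of how it will finally be implemented). It walks a directory tree and stops descending wherever a cache file with the default name is found. The check whether a cache file really belongs to the directory it sits in depends on the cache file format, which is not described here, so it appears only as a hypothetical hook named cache_belongs_to; plan_scan is likewise a made-up name.

```python
import os

CACHE_NAME = ".kdirstat.cache.gz"  # default cache file name quoted in the text


def plan_scan(root, cache_belongs_to=lambda cache_path, directory: True):
    """Decide which directories to scan and which to serve from a cache file.

    cache_belongs_to is a hypothetical hook: it should return True only if
    the cache file at cache_path describes exactly that directory.
    Returns (directories_to_scan, directories_served_from_cache).
    """
    to_scan, from_cache = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        cache_path = os.path.join(dirpath, CACHE_NAME)
        if CACHE_NAME in filenames and cache_belongs_to(cache_path, dirpath):
            from_cache.append(dirpath)
            dirnames[:] = []  # prune: do not descend below a cached directory
            continue
        to_scan.append(dirpath)
    return to_scan, from_cache
```

With the example from above, plan_scan("/home/kilroy") would list /home/kilroy/projects/hugeproj as served from its cache file and would not visit anything below it.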
I also plan to support exclude lists. Several users requested that feature. If you have directories you don't ever want to appear in KDirStat's display you will be able to add them to an exclude list. KDirStat will stop scanning such directories - very much like with mount points. And as with mount points, you can have KDirStat continue scanning there with two mouse clicks if you want.
The general idea is to have absolute paths like /home/kilroy/oldstuff as well as directory names like ".svn" that may appear in many places. Some basic wildcards might be supported as well.
The hard part about that might turn out to be the editor for that list. But since that is an advanced feature, maybe simple "vi" or "kedit" will have to make do.
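The matching rules for such an exclude list could look roughly like the following Python sketch. This is an illustration of the idea only, not the planned implementation: absolute rules are compared against the full path, all other rules against the last path component, and shell-style wildcards come from Python's fnmatch module (where "*" also matches "/"). The function name is_excluded is made up for this example.

```python
import fnmatch
import os


def is_excluded(directory, rules):
    """Return True if directory matches one of the exclude rules.

    Rules starting with "/" are absolute paths (wildcards allowed);
    all other rules are plain directory names that may appear anywhere.
    """
    directory = os.path.normpath(directory)
    name = os.path.basename(directory)
    for rule in rules:
        if os.path.isabs(rule):
            # Absolute rule: compare against the complete path.
            if fnmatch.fnmatchcase(directory, os.path.normpath(rule)):
                return True
        elif fnmatch.fnmatchcase(name, rule):
            # Name rule: compare against the last path component only.
            return True
    return False
```

A scanner would call is_excluded for each directory before reading it and treat a match like a mount point: stop there, but let the user continue on request.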
More Future Features
I have not lost track of the single most requested feature for KDirStat: Being able to select multiple files (or, more generally, items) to delete.
But... (you had that coming, admit it ;-) )
One thing is pretty sure: Multi-selection will not be a general concept for KDirStat. This will be limited to deleting files or directories.
Maybe KDirStat will get an internal trash can to handle that - maybe in a separate window that will hold all the stuff the user marked for deletion.
Maybe KDirStat will get some kind of check boxes in one of the columns where you can mark items for deletion - and a "delete marked items" menu entry (and tool bar button, of course).
Maybe KDirStat will get tool pointers like paint or drawing programs (GIMP, OOo-draw): One for normal operation and one "kill pointer" that instantly marks items for deletion.
Probably items marked that way will not go away instantly, but will be displayed in some special way - like dimmed or with strike-through font.
I don't know yet. I am even unsure if that will make it into the next stable release (2.6.1). I know that many users want this feature, but deleting files or even directories is a serious affair. This has to work reliably and without nasty surprises for the users, so I want that to be really thought through thoroughly (gee, what a tongue breaker...).
Another item on the agenda that is still open is an editor for treemap colors: Users should be able to select in what color to display their most important file types (MIME types) in the treemap display.
For sparse files, now only the amount of disk space actually allocated is added up. The tree display now includes both the nominal size and the actual size of sparse files - like this: 6.3 MB (allocated: 1.3 MB)
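For the curious, the difference between the two numbers can be read from the file's stat information. The following Python sketch shows the idea under the assumption of a POSIX system, where the block count is reported in 512-byte units. It is not KDirStat's code, and the function names are made up for this example.

```python
import os

MB = 1024 * 1024


def nominal_and_allocated(path):
    """Return (nominal size, allocated size) of a file in bytes."""
    st = os.lstat(path)
    # On POSIX systems st_blocks is counted in 512-byte units,
    # independent of the file system's own block size.
    return st.st_size, st.st_blocks * 512


def size_column(nominal, allocated):
    """Format the size the way the tree display shows sparse files."""
    if allocated < nominal:
        # Fewer bytes allocated than the nominal size: a sparse file.
        return f"{nominal / MB:.1f} MB (allocated: {allocated / MB:.1f} MB)"
    return f"{nominal / MB:.1f} MB"
```

Note that for small ordinary files the allocated size is usually larger than the nominal size because of partially used blocks, so only "allocated less than nominal" is treated as sparse here.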
Files with multiple hard links are now added up partially for each hard link found and are displayed as something like 512 MB / 8 Links.
A 512 MB file with 8 hard links now gets 512 MB / 8 for each time found. If all 8 hard links are within the current directory tree, it adds up to 512 MB. Formerly, this file was simply added up 8 times, thus distorting the overall sum.
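The arithmetic is simple enough to show in a few lines. This is a sketch in Python of the accounting rule described above, not the actual KDirStat code; share_per_link and tree_total are names made up for this example.

```python
import os


def share_per_link(size, link_count):
    """Size charged for one directory entry of a file with link_count hard links."""
    return size / link_count if link_count > 1 else size


def tree_total(root):
    """Add up a directory tree, charging each hard link only its share."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            total += share_per_link(st.st_size, st.st_nlink)
    return total
```

For the 512 MB file with 8 hard links, share_per_link(512, 8) gives 64 (MB) per link. If all 8 links are below the scanned directory, the shares add up to 512 MB again; if only some of them are, the tree is charged only for those.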
All in all, this comes much closer to what "du" reports - "du" takes hard links and sparse files into account, too. (There may still be some difference due to the way of accounting partially used file system clusters, though.)
But in real life, I found that both sparse files and files with multiple hard links are only very rarely used at all these days: In a whole Linux distribution with well over 100,000 files there are usually no more than about 10 sparse files, and the only place I found where hard links are actually used seems to be /usr/lib/locale . Everywhere else symbolic links seem to be preferred these days.
So if you missed this release, you probably don't need to worry: It doesn't make too much of a difference for most users.
2004-12-08 Fixed incomplete tarball - now including configure script
Some people correctly pointed out that I had forgotten to package the configure script with that last kdirstat-2.4.3.tar.bz2. Sorry. Here is the correct one: kdirstat-2.4.3.tar.bz2.
Of course normal users shouldn't need package maintainer tools like automake and autoconf just for building the package - a ready-made configure script should be distributed.
Note: The links from the older news entry now point to the correct tarball to avoid further confusion.
2004-12-06 New (stable) Maintenance Release: kdirstat-2.4.3.tar.bz2
Some minor changes added up, so it's time for a regular release. The most notable change is a new "Open with" action in the "cleanup" menus: You can use that to open any file or directory with an application of your choice, for example your favourite editor.
Something that really amazed me is that the news section of 2003-08-26 contained two links that didn't show up as links - the binary and the source RPMs. But not one person ever complained about that. Strange. ;-)
2003-08-26 New Stable Release: kdirstat-2.4.0.tar.bz2
It's been a while since the last release of KDirStat that was officially declared stable. Nevertheless, the last few releases had turned out to be so rock-solid that it makes perfect sense to simply call the latest (2.3.7) stable - remember, KDirStat follows the Linux kernel tradition of using even release numbers for stable versions and odd numbers for experimental "hacker" versions.
I threw in a fix for a long-standing bug that was more annoying than a real problem: When quitting KDirStat while it was still reading directories, it crashed. Well, OK, you wanted it to terminate, but probably not with a core dump. ;-) Anyway, this is fixed now.
In case you are wondering why the source RPM is so much smaller than the source tarball: The tarball comes with a complete admin/ subdirectory that contains all the autoconf / automake magic PLUS it includes a ready-made configure script so you don't need autoconf, automake and libtool for building. All that stuff eats up a lot of disk space and bandwidth, however, and the SuSE build logic reimports and recreates all that anyway from the installed system so it's no use including it in the source RPM.
What the Future Will Bring (after 2.4.0)
Treemap Color Editor
The treemap colors that are used now are hard-coded - which of course is bad. There will be a treemap color editor that will allow the user to define his favourite colors according to MIME type or according to file name patterns (which is probably much less expensive in terms of performance) or both.
Right now there are only a few file types that are recognized. As a general rule of thumb: Red or reddish means bad (core dumps, *.bak, *~ etc.) - stuff you probably don't really need or want to have. Green means compressed stuff - optimum use of disk space. Blue means documents of some kind, cyan are executables or related (libs etc.). Yellow is your MP3 or video collection - stuff where you can most likely save a lot of disk space. ;-)
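A rule table based on file name patterns might look like the following Python sketch. It only illustrates the rule of thumb above and is not the hard-coded list KDirStat uses: the patterns "*.bak" and "*~" are taken from the text, while "core" as the name of a core dump, the remaining patterns, and the default color are assumptions made for this example.

```python
import fnmatch

# (pattern, color) pairs; the first matching pattern wins.
# Only "*.bak" and "*~" come from the text above. "core" and all
# other patterns are illustrative assumptions.
COLOR_RULES = [
    ("core", "red"),      # core dumps
    ("*.bak", "red"),     # backup files
    ("*~", "red"),        # editor backup files
    ("*.gz", "green"),    # compressed stuff
    ("*.bz2", "green"),
    ("*.so", "cyan"),     # executables and libs
    ("*.mp3", "yellow"),  # MP3 or video collection
]


def color_for(filename, rules=COLOR_RULES, default="grey"):
    """Return the treemap color for a file name (default is an assumption)."""
    for pattern, color in rules:
        if fnmatch.fnmatchcase(filename, pattern):
            return color
    return default
```

A pure pattern lookup like this never opens the file, which is why it is so much cheaper than full MIME type detection.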
I had planned to wait with 2.4.0 for that color editor, but I hadn't reckoned with a summer in Germany to be that long. Blame it on the weather. ;-)
KDirStat as KPart
Making KDirStat a KPart is not forgotten, but it will definitely come after the color editor - one reason being that getting all that automake/autoconf/libtool stuff right to set this up is a major nightmare.
If anybody feels up to that without moving KDirStat into the KDE CVS I will gladly accept a patch...
This is the single most-wanted enhancement for KDirStat: Being able to select more than just one file or directory for cleanup actions. Unfortunately, this messes up much of the internal logic of how selections are handled.
If (and when) I have a really good idea how to do that, I will.
Explicitly Excluding Directories
Somebody asked for this not long ago: Exclude specific directories from reading. It makes sense to me, and it shouldn't be too hard to implement. The harder part of this is probably all the configuration stuff around that. Well, maybe we'll start with a static list in the user's home directory - for people who have not completely forgotten yet how to handle a text editor. ;-)
Fix the PacMan Widget
The PacMan animation widget in the tool bar doesn't support custom widget themes like Keramik that require using Qt's polish magic. This is why it always stands out like a grey blob against the neatly rendered background. It looks very unprofessional.
Anybody up for a patch?
Another Busy State Animation
According to user feedback mails, about as many users seem to like the PacMan animation as not. I would greatly welcome an alternative animation.
Free Disk Space Display
Guess where the disk free display used in the YaST2 package manager comes from? Do the percentage bars look familiar to you? This is something I had planned all along to recycle in KDirStat. I am only unsure whether to open a new window for that or to squeeze that into the main window or whatever.
Moving KDirStat into the KDE CVS
Over my dead body. ;-)
All kidding aside: I usually don't have time for KDirStat exactly when the KDE folks decide to release a new KDE version or when something breaks deep within because of KDE API changes or because somebody decides that it's way cooler to use the ultra-new experimental Autoconf or Automake or Qt-Lib.
Been there, done that, hated it from the bottom of my heart.
Open Source is usually good quality software primarily because people really like making it. Because it is fun. Because of the aesthetics of the resulting code. But all this is there only as long as there isn't additional outside pressure - pressure like needing to change stuff that used to work perfectly but doesn't work any more with the latest bleeding edge (and boy is that stuff bloody!) Autoconf or Libtool or whatever. This is no fun. This can drive you crazy.
2002-08-19 New RPMs for RedHat Linux
2003-05-25 Performance Boost: kdirstat-2.3.7.tar.bz2 (Development Version)
Milos Prudek pointed out to me that there was a huge performance increase if you minimized KDirStat's window during directory reading. I couldn't quite believe that: I had already optimized KDirStat to perform screen updates no more than three times a second. But he actually went through the trouble of benchmarking those scenarios to convince me.
He was correct. Minimizing the window made directory reading roughly 20 times faster!
After some investigations I found out that items of KDirStat's internal tree that holds directory information were cloned as QListView items much too generously, leading to about as many QListView items as there are directories in the tree you are scanning. Performance suffers a lot for QListView widgets with some 150000 items. Oops...
This was clearly not desired: Only those items you can see (or those you opened manually) should be cloned from the internal tree to the QListView widget. This is (the most important part of) what I fixed for this release.
The resulting benchmarks are impressive:
2003-04-10 RPMs for Mandrake 9.1
2003-02-03 New Bugfix Release: kdirstat-2.3.6.tar.bz2 (Development Version)
2003-01-30 Colored Treemap: kdirstat-2.3.5.tar.bz2 (Development Version)
Treemap Now Colored According to File Type
Known file types get their own special colors so you can see at a glance from the colors if it's your MP3 collection, images, compiled objects or other stuff that clutters up your disk. You can now locate core dumps (red), images (cyan), object files (orange) or archives (green) hidden deep in the directory hierarchy.
I found it really amazing that this little more color helped me see images and archives in deeply nested system directories in a treemap that covered my entire root file system.
The current color rules are still static, but of course in a future version this will be user customizable. Stay tuned.
2003-01-07 New (Development) Version: kdirstat-2.3.4.tar.bz2
2003-01-05 Treemaps are back: kdirstat-2.3.3.tar.bz2
As the odd (2.3.x) release number implies, this is a development version. It may still have a few bugs (even though I found it pretty stable). So, please don't use this version in a critical production environment. If you really must use this version in a nuclear power plant, please make sure it's far away from central Europe. ;-)
So, what is it all about? (more at the treemap features section)
For those of you who don't know anything about treemaps yet: A treemap is just another way to display a tree where each node has an associated value - such as a directory tree with file sizes. All the end user needs to know is that each rectangle in a treemap corresponds to one file (or directory), and the larger the rectangle is, the larger is that file (or directory).
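To make "the larger the rectangle, the larger the file" concrete, here is a small Python sketch of the simplest treemap layout, commonly called slice-and-dice: a rectangle is split among the children in proportion to their sizes, alternating between horizontal and vertical splits at each directory level. This is deliberately not the algorithm KDirStat uses; it only shows the principle. The tree representation (name, size) for files and (name, list of children) for directories is an assumption made for this example.

```python
def total_size(node):
    """Sum of all file sizes at or below node."""
    name, payload = node
    if isinstance(payload, list):  # directory: payload is the child list
        return sum(total_size(child) for child in payload)
    return payload  # file: payload is the size


def layout(node, x, y, width, height, horizontal=True):
    """Return [(name, x, y, width, height)] for every file below node."""
    name, payload = node
    if not isinstance(payload, list):
        return [(name, x, y, width, height)]
    total = total_size(node)
    rects = []
    offset = 0.0  # fraction of the parent rectangle already used
    for child in payload:
        share = total_size(child) / total if total else 0.0
        if horizontal:
            # Split left to right; children are split top to bottom.
            rects += layout(child, x + offset * width, y,
                            share * width, height, False)
        else:
            rects += layout(child, x, y + offset * height,
                            width, share * height, True)
        offset += share
    return rects
```

For a tree with one file of size 50 and a subdirectory holding two files of size 25 each, a 100 x 100 area is split into one 50 x 100 rectangle and two 50 x 50 rectangles, so each rectangle's area is proportional to its file's size.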
Using this is really easy: Look at the treemap. Find large rectangles. Click on them - voila, they are selected in the traditional tree (list) above, and you can see more details. Even better, if you find you don't want that file any more, you can immediately use KDirStat's cleanup actions and delete it (or compress it or whatever).
Thus, the treemap can help you pinpoint large files that are hidden deep within a directory tree. You can see the entire tree at once in the treemap, so you can easily identify large files.
On the other hand, the traditional tree view (the list) is far better at showing accumulated sizes for entire directory branches. The combination of both is what makes a really powerful tool: Both views are slaved to one another, so if you select a file or directory in the treemap, it is automatically located and selected in the tree view - and vice versa.
The authors of SequoiaView of the Technical University Eindhoven in the Netherlands kindly made their papers about treemaps available to the public (thanks again, folks!). Those papers were the base for this treemap implementation.
As the treemap implementation evolved (hence the version number 2.3.3 - 2.3.0, 2.3.1, and 2.3.2 existed briefly, but I didn't have a chance to release them to the public - believe it or not, no Internet access where I wrote that), there were several variants of treemaps:
Since there is so much room for experimenting, I chose to make all those combinations available to the user. If you don't like the treemap settings, tweak them to meet your personal preferences:
More details at the treemap features section.
Other Improvements of this Version
Plans for the Future
Before we head for KDirStat-2.4.0 (the next version intended for use in production environments), I want to make sure that stability and resource consumption haven't suffered because of all the changes due to treemaps. The prime objective for a "stable" (even numbered, i.e., 2.4.x) release is to make sure it is worthy of that predicate.
As for features, I'd like to make treemaps MIME-type aware before the next stable release: Treemap tiles should be colored according to the type of the corresponding file. If you have lots of MP3s or PNGs or object files on your hard disk, you should be able to tell that from how many blue or yellow or whatever rectangles you see. Of course this will be user configurable to a high degree.
But what bugs me most is how to make that really efficient - KDE's MIME type recognition is really great, but it can cost a lot of performance: KDE doesn't just look at filename extensions, it also looks into the file to look for magic numbers in headers etc.
Maybe this will be a simple implementation: Just according to filename extensions, no magic numbers. But on the other hand, I'd like to use the host of MIME types that KDE already knows - all files classified as "audio/..." (MP3, WAV, whatever), all files classified as "image/..." (PNG, JPG, GIF, TIF, BMP, ...) etc.
Maybe it will be a hybrid approach: Simple filename extension lookup for small files (i.e., small resulting treemap tiles), exact but expensive full-fledged MIME type recognition for larger ones. Maybe KDirStat should build its own internal cache for those expensive operations so they are not required all over again just because a directory has to be re-read or because the treemap is rebuilt - which it is very frequently: Complete rebuilds are necessary for all kinds of changes, otherwise the treemap would get "holes" where deleted files have been.
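As a thought experiment, the hybrid idea could be sketched like this in Python. Nothing here is existing KDirStat code: the extension table, the size threshold, and the names classify and sniff are all assumptions for illustration, and sniff stands in for whatever expensive content-based detection (such as KDE's) would really be used.

```python
import os

# Illustrative extension table; a real one would be much larger.
EXTENSION_TYPES = {
    ".mp3": "audio/mpeg",
    ".png": "image/png",
    ".gz": "application/gzip",
}


def classify(path, size, sniff, cache, threshold=1024 * 1024):
    """Return a MIME type for path using the hybrid strategy.

    sniff(path) is a hypothetical, expensive content-based detector.
    cache is a dict that remembers its results across treemap rebuilds.
    threshold (in bytes) is an arbitrary value chosen for this sketch.
    """
    if size >= threshold:
        # Large file, large tile: worth the exact but expensive lookup,
        # done at most once per path thanks to the cache.
        if path not in cache:
            cache[path] = sniff(path)
        return cache[path]
    # Small file: cheap lookup by file name extension only.
    extension = os.path.splitext(path)[1].lower()
    return EXTENSION_TYPES.get(extension, "application/octet-stream")
```

A cache keyed by path alone goes stale when a file is replaced, so a real implementation would probably also have to remember something like the modification time.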
P.S. Comments are welcome, as always.
2002-10-23 New RPM for RedHat 8.0
kdirstat-2.2.0-1.i386.rpm (RedHat only!)
2002-08-08 Current Development Status
I am in the process of reimplementing treemaps for KDirStat.
The CVS version already has a (albeit pretty experimental) version of that,
On the pro side, communication between the traditional tree view and the tree map view works pretty well: If you click on a tile in the treemap, the corresponding item in the tree view is selected, scrolled to be visible and opened, closing all previously opened items. This makes it pretty easy to find large single files, even if they are nested deeply within the directory hierarchy.
What I intend to do now is rework that part to make more efficient use of system resources (maybe QCanvas is the way to go), reintroduce pretty 3D shading of some kind and use different colors to visualize different types of files so you can see if it's your MP3 collection, images from the web, core dumps or something else that uses your disk space.
Oh yes, and I flatly refuse to recycle any of the old treemap code. It cost me too much precious time already trying to make any sense of that, and there are aesthetic reasons as well (remember, this is a purely private fun project, and without the fun part there isn't too much left...).
2002-08-08 New RPMs for RedHat 7.3
Get the latest tarball from the download area and read the build instructions.
Usually there is also an RPM package that runs on the latest SuSE Linux distribution, so you don't need to bother building KDirStat yourself if you are using SuSE Linux.
For RedHat RPMs, you can look at the Guru Labs download pages - they may or may not have the latest version available as RPMs. If they don't have the latest version yet, please be patient - it may be coming soon.
The KDirStat CVS is up and running at
Both read/write access for the development team and public (anonymous) read-only access are available.
For those who need to be on the bleeding edge of development, there is public CVS access kindly hosted at
Stays on one file system by default - reads mounted file systems only on request.
You don't care about a mounted /usr file system if the root file system is full and you need to find out why in a hurry, nor do you want to scan everybody's home directory on the NFS server when your local disk is full.
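The usual way to detect a mount point while walking a tree is to compare device numbers: a directory on a different file system reports a different device than the starting directory. The following Python sketch shows that technique; it is an illustration only and not taken from KDirStat's sources, and the function name scan_one_filesystem is made up for this example.

```python
import os


def scan_one_filesystem(root):
    """Add up file sizes below root without crossing mount points.

    Returns (total_bytes, skipped_mount_points).
    """
    root_device = os.lstat(root).st_dev
    total = 0
    skipped = []
    for dirpath, dirnames, filenames in os.walk(root):
        keep = []
        for name in dirnames:
            full = os.path.join(dirpath, name)
            if os.lstat(full).st_dev == root_device:
                keep.append(name)
            else:
                # Different device: a mounted file system starts here.
                skipped.append(full)
        dirnames[:] = keep  # os.walk only descends into what is left
        for name in filenames:
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total, skipped
```

The skipped list corresponds to the mount points that are read only on request: a caller could run the same function on any of them later.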