Last modified: 01 Sep 2006 Contact: Stefan Hundhammer

KDirStat

What is it?

KDirStat is a graphical disk usage utility, very much like the Unix "du" command. In addition to that, it comes with some cleanup facilities to reclaim disk space.

While KDirStat is a KDE program, it runs fine on every X11 desktop, i.e., it runs on Linux, BSD, and lots of other Unix-type systems (Solaris, HP-UX, AIX, ...).

MS Windows Users please note that there are operating systems and window systems beyond those from Redmond, WA. This may come as a surprise to some people. ;-) There is a MS Windows clone called WinDirStat. Yes, that one is the clone. KDirStat is the original.

Screenshots:
KDirStat main window (137 k)
Configuration dialogs: cleanup actions, tree colors, treemap, general/misc (~25 k each)
Feedback mail dialog (32 k)

See the features section for more info.









News

2006-08-29: Bogus News Story about KDirStat in the Chicago Tribune

There was an article in the Chicago Tribune about Open Source software in general and a program they found useful in particular: WinDirStat. Granted, their advocating Open Source is commendable - even though the author of that article did not seem to have read the available documentation about treemaps and what they are for: He writes about "Tetris-style jewel colors" without bothering to explain what that thingy is good for.

Whatever: They do introduce WinDirStat and explain its usefulness. And they even include a download hyperlink. But that hyperlink is plain wrong. They mean to hyperlink to WinDirStat, but that address would be http://windirstat.info/, not what they wrote: http://kdirstat.sourceforge.net/. Yes, you read right: They hyperlinked to the KDE/Linux/BSD/Unix* version, not to the MS Windows version.

Of course, a number of people fell for that trap, wondering how to start the kdirstat.tar.bz2 file they downloaded on their MS Windows machines. A few of them even contacted me for help.

Obviously nobody at the Chicago Tribune bothered to click on their own link to find where it would lead. They would have noticed instantly.

MS Windows Users: You want WinDirStat, not KDirStat. KDirStat runs on KDE running on top of Linux, BSD and other Unix-type operating systems. Even though the mere thought might be alien to many MS Windows users: There are operating systems and window systems beyond those from MS. Not every program with a nice GUI runs on top of MS Windows. ;-)

And just to reiterate: KDirStat is the original. WinDirStat was a port to MS Windows. Yes, it can be that way, too. ;-)

2006-06-01 New Development Version: kdirstat-2.5.3.tar.bz2

(Last stable release is 2.4.4)

This is a bug-fix release to 2.5.2. Those of you who used that new cache file feature will have noticed that reading the cache files tended to crash frequently. This is the main fix for this release.

Changes:

  • Fixed crash on reading cache files.
  • Fixed crash on directory reloading and (sometimes) initial directory reading.
  • Fixed bogus "sparse files" reports.

    A lot of files were reported as being sparse that actually were not. Some file system types seem to have very creative (and of course highly undocumented) ways to store block fragments, so their block counts would not match the corresponding byte size of files very often. NTFS is one example for that.


Tip:

The new cache file reading and writing feature can be used with server machines that don't even have X11 or KDE installed. If they have Perl, you can use the supplied Perl script kdirstat-cache-writer to scan directory trees in cron jobs over night on the server. View the result with KDirStat on any other machine that has X11 and KDE whenever it is convenient - without creating I/O or CPU load on the server.


RPM for SuSE Linux 10.1: kdirstat-2.5.3-0.1.i586.rpm (328 K)
Source RPM for SuSE Linux: kdirstat-2.5.3-0.1.src.rpm (245 K)
Spec File for SuSE Linux: kdirstat.spec (3.3 K)
Source tarball: kdirstat-2.5.3.tar.bz2 (662 K)

Sorry it took so long to release this version. If you look into the change log inside the package, you will find that the fixes went to CVS on 6th February. But I never got around to making packages and announcing the release here and on Freshmeat and apps.kde.org because at about that time all hell broke loose here at SuSE with the libzypp project. With all the overtime we worked during that period I didn't feel very much like spending time on weekends working on KDirStat.

But development will continue. Promised.

2006-01-08 New Development Version: kdirstat-2.5.2.tar.bz2

(Last stable release is 2.4.4)

KDirStat can now read and write cache files, i.e., files that contain the disk information KDirStat displays. You can use a Perl script that comes with KDirStat to generate cache files over night in cron jobs and display the content of a very large directory tree in a couple of seconds.

To give some rough numbers, on my laptop it takes KDirStat about 3 minutes to scan /usr. Reading the same information from a cache file takes 3-5 seconds.

On the downside, the disk content may have changed in the meantime. A cache file is outdated by definition. But it may still give you some rough ideas. And there are large directory trees that hardly ever change.

Or you may be a system administrator with an NFS server that houses home directories, and every now and then you have to check exactly which of your users has once again managed to fill up that file system to 95%. One thing you cannot do (or your users will hate you for it) is start KDirStat during working hours to scan all those home directories. So do that with the kdirstat-cache-writer Perl script in a cron job running in the middle of the night and view the result with KDirStat during your normal office hours.

As a matter of fact, one of our system administrators at SuSE requested this KDirStat feature for that very reason. So here it is.

This is still in development. It is currently integrated into the KDirStat user interface only very crudely: There are two entries in the "File" menu, "Write to Cache File..." and "Read Cache File...".

There is currently no indication that cached values are displayed. This will have to change.

When you click the "Reload" button, the directory tree is really scanned (just in case you might have thought the cache file is read again). This is intentional.

Tip: If you generate cache files with "kdirstat-cache-writer -l", they will become somewhat (~20%) larger, but you can also use them as a replacement for "locate". Simply use "zgrep" in such a file and ignore the size, mtime etc. fields.
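
If a script is handier than the command line, the "zgrep" trick can be sketched in a few lines of Python. This is only an illustration: it assumes nothing about the cache file format beyond it being gzip-compressed text with one entry per line, and the function name grep_cache is made up for this example.

```python
import gzip

def grep_cache(cache_path, needle):
    """Return all lines of a gzip-compressed text file that contain needle.

    Assumption: one entry per line. The size, mtime etc. fields are left
    in the returned lines and can simply be ignored by the caller.
    """
    matches = []
    with gzip.open(cache_path, "rt", encoding="utf-8", errors="replace") as cache:
        for line in cache:
            if needle in line:
                matches.append(line.rstrip("\n"))
    return matches
```

Calling grep_cache("/some/dir/.kdirstat.cache.gz", "hugeproj") would then list every cached entry whose line mentions "hugeproj" (the path is just an example).
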
RPM for SuSE Linux 10.0: kdirstat-2.5.2-0.1.i586.rpm (333 K)
Source RPM for SuSE Linux: kdirstat-2.5.2-0.1.src.rpm (840 K)
Source tarball: kdirstat-2.5.2.tar.bz2 (662 K)

BTW Kudos to the guys who wrote zlib: This is an ingenious piece of software.

Not only does it make reading and writing compressed files incredibly easy, it also does so in a very intuitive way: Simply replace "fopen()" with "gzopen()", "fprintf()" with "gzprintf()", "fgets()" with "gzgets()" etc. and you are done. It can even read uncompressed files as well as compressed files, so you don't need to duplicate any work.

This is how libraries are supposed to be written! Thanks a lot, guys!
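
For readers who do not write C: the "one reader for both compressed and uncompressed files" convenience can be approximated in Python. Note that this is neither zlib's API nor KDirStat code, just a sketch. Python's gzip module does not fall back to plain files by itself, so the hypothetical helper open_maybe_gzip checks the two gzip magic bytes first.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream

def open_maybe_gzip(path):
    """Open path for reading text, whether it is gzip-compressed or not."""
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == GZIP_MAGIC
    if is_gzip:
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "rt", encoding="utf-8")
```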


The Vision

I want to make KDirStat read cache files with a default file name (currently ".kdirstat.cache.gz") automatically when they are found while reading a directory tree and the cache file belongs to that directory tree.

For example, during scanning /home/kilroy, if /home/kilroy/projects/hugeproj is read and it contains a file .kdirstat.cache.gz with the content of /home/kilroy/projects/hugeproj, the content of that cache file is used rather than further scanning everything below /home/kilroy/projects/hugeproj .
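
To make the idea concrete, here is a rough Python sketch of such a scan. It is not KDirStat code (KDirStat does not do this yet), and it leaves out the check whether a cache file really belongs to the directory it was found in. The function name scan_with_caches is invented for this example.

```python
import os

CACHE_NAME = ".kdirstat.cache.gz"  # default cache file name mentioned above

def scan_with_caches(top):
    """Walk top, but stop descending wherever a cache file is found.

    Returns (scanned_dirs, cached_dirs). A real implementation would also
    have to verify that the cache file describes that very directory;
    that check is left out of this sketch.
    """
    scanned, cached = [], []
    for dirpath, dirnames, filenames in os.walk(top):
        if CACHE_NAME in filenames:
            cached.append(dirpath)
            dirnames[:] = []  # prune: do not scan anything below this point
        else:
            scanned.append(dirpath)
    return scanned, cached
```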

That way, you can simply throw a cache file into kdirstat's way for large directories that hardly ever change - or for large directories you don't care too much about anyway.

Of course, there will be a setup option to switch that behaviour off.

I also plan to support exclude lists. Several users requested that feature. If you have directories you don't ever want to appear in KDirStat's display you will be able to add them to an exclude list. KDirStat will stop scanning such directories - very much like with mount points. And as with mount points, you can have KDirStat continue scanning there with two mouse clicks if you want.

The general idea is to have absolute paths like /home/kilroy/oldstuff as well as directory names like ".svn" that may appear in many places. Some basic wildcards might be supported as well.
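
One way such an exclude list could be matched is sketched below in Python. Nothing of this exists in KDirStat yet. The rule "entries starting with a slash are absolute paths, everything else is a directory name" and the function name is_excluded are assumptions made for this example.

```python
import fnmatch
import os

def is_excluded(path, exclude_list):
    """Return True if path matches an entry of exclude_list.

    Entries starting with "/" are compared with the whole path; all other
    entries are compared with the last path component only. Both kinds may
    contain shell-style wildcards (*, ?, [...]).
    """
    path = os.path.normpath(path)
    name = os.path.basename(path)
    for pattern in exclude_list:
        target = path if pattern.startswith("/") else name
        if fnmatch.fnmatchcase(target, pattern):
            return True
    return False
```

With the list ["/home/kilroy/oldstuff", ".svn"], both /home/kilroy/oldstuff and any directory named .svn anywhere in the tree would be skipped.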

The hard part about that might turn out to be the editor for that list. But since that is an advanced feature, maybe simple "vi" or "kedit" will have to make do.

More Future Features

I have not lost track of the single most requested feature for KDirStat: Being able to select multiple files (or, more generally, items) to delete.

But... (you had that coming, admit it ;-) )
what I still don't have is a clear concept of how to integrate that with some of KDirStat's features like the generic cleanup action concept or the treemap. Both features are very intuitive to use as long as there is no more than one selected item. But when there are many of them, what should happen? Would the cleanups still work the way the user expects them to work? Would you still know what in the treemap corresponds to what in the tree view? This is what needs to be figured out.

One thing is pretty sure: Multi-selection will not be a general concept for KDirStat. This will be limited to deleting files or directories.

Maybe KDirStat will get an internal trash can to handle that - maybe in a separate window that will hold all the stuff the user marked for deletion.

Maybe KDirStat will get some kind of check boxes in one of the columns where you can mark items for deletion - and a "delete marked items" menu entry (and tool bar button, of course).

Maybe KDirStat will get tool pointers like paint or drawing programs (GIMP, OOo-draw): One for normal operation and one "kill pointer" that instantly marks items for deletion.

Items marked that way will probably not go away instantly, but will be displayed in some special way - like dimmed or with a strike-through font.

I don't know yet. I am even unsure if that will make it into the next stable release (2.6.1). I know that many users want this feature, but deleting files or even directories is a serious affair. This has to work reliably and without nasty surprises for the users, so I want that to be really thought through thoroughly (gee, what a tongue breaker...).

Another item on the agenda that is still open is an editor for treemap colors: Users should be able to select in what color to display their most important file types (MIME types) in the treemap display.

Volunteers?

2006-01-08 (originally from 2005-02-22) New Stable Release: kdirstat-2.4.4.tar.bz2

For some reason, this news item never made it out to the world until now. Sorry for that.

KDirStat learned how to deal with hard links and with sparse files.

For sparse files, now only the amount of disk space actually allocated is added up. The tree display now includes both the nominal size and the actual size of sparse files - like this: 6.3 MB (allocated: 1.3 MB)

Files with multiple hard links are now added up partially for each hard link found and are displayed as something like 512 MB / 8 Links.

A 512 MB file with 8 hard links now gets 512 MB / 8 for each time found. If all 8 hard links are within the current directory tree, it adds up to 512 MB. Formerly, this file was simply added up 8 times, thus distorting the overall sum.
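
The arithmetic can be illustrated with a small Python sketch. This is not KDirStat's actual (C++) code; it ignores sparse files and file system clusters, and the function name is made up for this example.

```python
import os
import stat

def tree_size_with_hard_links(top):
    """Sum up file sizes below top, counting size / link count per name."""
    total = 0.0
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            info = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(info.st_mode):
                continue  # skip symlinks, sockets etc. in this sketch
            # A file with N hard links contributes 1/N of its size each
            # time one of its names is found.
            total += info.st_size / max(info.st_nlink, 1)
    return total
```

If all links of a file are inside the tree, the parts add up to the full file size. If some of them are elsewhere, only the corresponding fraction is counted.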

All in all, this comes much closer to what "du" reports - "du" takes hard links and sparse files into account, too. (There may still be some difference due to the way of accounting partially used file system clusters, though.)

But in real life, I found that both sparse files and files with multiple hard links are only very rarely used at all these days: In a whole Linux distribution with well over 100,000 files there are usually no more than about 10 sparse files, and the only place I found where hard links are actually used seems to be /usr/lib/locale . Everywhere else symbolic links seem to be preferred these days.

So if you missed this release, you probably don't need to worry: It doesn't make too much of a difference for most users.
RPM for SuSE Linux 10.0: included in SuSE 10.0 default distribution
RPM for SuSE Linux 9.3: uh - dunno, check rpmseek.com
RPM for SuSE Linux 9.2: kdirstat-2.4.4-1.1.i586.rpm (324 K)
Source RPM for SuSE Linux: kdirstat-2.4.4-1.1.src.rpm (242 K)
Source tarball: kdirstat-2.4.4.tar.bz2 (623 K)

Technical Background: Hard Links

In Unix/Linux file systems, files primarily have a numeric ID, their "i-number", the index of the corresponding "i-node", the file system's administrative information block. Each directory entry of a file really is no more than a link to that i-node. You can have the very same file under several distinct names this way - even in different directories. The only limitation is that this is restricted to one file system (i.e. to one disk partition) because those i-numbers are unique only per file system.

Hard links can also introduce a whole new dimension of problems with applications that create backup copies of working files - they usually only rename the original file to a backup name and write their content to a new file. Editors usually work that way. This however means that any additional hard links to that file now point to the outdated backup copy - which is normally not what is desired. Only very few applications handle this reasonably. So the bottom line is: Use hard links only if you know very well what you are doing.
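
The effect is easy to reproduce. The following Python sketch imitates the "rename to backup, then write a new file" way of saving (the ".bak" suffix and all file names are just examples) and shows what a second hard link contains afterwards.

```python
import os
import tempfile

def save_with_backup(path, new_content):
    """Save the way many editors do: rename the old file, write a new one."""
    os.rename(path, path + ".bak")        # the old i-node is now the backup
    with open(path, "w") as new_file:     # this creates a brand-new i-node
        new_file.write(new_content)

def demo():
    """Return what the original name and a second hard link contain."""
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "notes.txt")
        alias = os.path.join(workdir, "alias.txt")
        with open(original, "w") as f:
            f.write("old")
        os.link(original, alias)          # second name for the same i-node
        save_with_backup(original, "new")
        with open(original) as f, open(alias) as g:
            return f.read(), g.read()
```

demo() returns ("new", "old"): the original name has the new content, while the other hard link still refers to the i-node that has become the backup copy.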

All this is probably why symbolic links have become so much more popular in recent years: They can also point to different file systems, even (via NFS) to different hosts in the network. On the downside, symlinks can also be stale - pointing into nothingness. This cannot happen with hard links: A file is only really deleted when the last of its links is deleted (this includes open i-nodes in memory - i.e., processes still having an open file handle to that i-node).

Directories rely completely on hard links (this is also why KDirStat does not attempt to try anything smart with multiple-hard-link directories - it would make no sense): The ".." entry in each directory is nothing other than another hard link to its parent (named ".."), and "." is nothing other than a hard link to the directory itself. This is also why even a completely empty directory has a link count of 2 - one for "." in its own directory, one for its name in its parent directory.

For more information, please refer to the FAQ section of the online help of KDirStat-2.4.4 or later.

Technical Background: Sparse Files

Unix / Linux file systems know the concept of so-called "sparse files" (also known as "files with holes"): Files that largely consist of zeroes and only very few data blocks. This can mean that a file shows, say, 6.3 MB with "ls", but only, say, 1.3 MB of that is actually allocated - the rest is just zeroes.

This is typical for core dumps (memory images of crashed programs written to a file named "core" or "core.*") or binary database files: The kernel writes those files in a way so only real data content is allocated on disk and not the large amount of zeroes.

Technically, a sparse file is created with the regular open() system call to open the file for writing, then using lseek() to extend the file size beyond its previous size and then writing at least one byte. The area between the old and the new file size becomes a "hole" in the file - it is not actually allocated on the disk. Upon reading this area, a value of zero is returned for each byte read. When bytes are written to that area, file system blocks are actually allocated, possibly creating two smaller holes before and after the area written to.
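
The same sequence of system calls is available from Python's os module, so the procedure can be tried out with a short sketch like this one. The function names are made up for this example. Whether the hole really stays unallocated depends on the file system, and the 512 bytes per block used for st_blocks is the usual Unix convention, not something guaranteed everywhere.

```python
import os

def make_sparse_file(path, hole_size):
    """Create a file consisting of a hole of hole_size bytes plus one byte."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.lseek(fd, hole_size, os.SEEK_SET)  # seek beyond the end of the file
        os.write(fd, b"x")                    # write at least one byte
    finally:
        os.close(fd)

def nominal_and_allocated(path):
    """Return (size in bytes, bytes actually allocated on disk)."""
    info = os.stat(path)
    # st_blocks counts 512-byte units on Linux and most Unix systems
    return info.st_size, info.st_blocks * 512
```

On a file system that supports holes, nominal_and_allocated() reports a size of hole_size + 1 bytes for such a file, but only a few allocated blocks.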

Please note that most file utilities do not deal gracefully with sparse files. Those that support them at all normally need special command line arguments. Otherwise they tend to simply read all bytes (including all the zeroes from the holes) and write them to a new location - which of course means that the resulting file is no longer sparse, but really occupies all the space its size indicates. This may mean that you can blow up the above 6.3 MB core dump file from 1.3 MB disk usage (and 5 MB zeroes in holes) to really 6.3 MB disk usage.

For more information, please refer to the FAQ section of the online help of KDirStat-2.4.4 or later.


2004-12-08 Fixed incomplete tarball - now including configure script

Some people correctly pointed out that I had forgotten to package the configure script with that last kdirstat-2.4.3.tar.bz2. Sorry. Here is the correct one: kdirstat-2.4.3.tar.bz2.

Of course normal users shouldn't need package maintainer tools like automake and autoconf just for building the package - a ready-made configure script should be distributed.

Note: The links from the older news entry now point to the correct tarball to avoid further confusion.

2004-12-06 New (stable) Maintenance Release: kdirstat-2.4.3.tar.bz2

Some minor changes added up, so it's time for a regular release. The most notable change is a new "Open with" action in the "cleanup" menus: You can use that to open any file or directory with an application of your choice, for example your favourite editor.

Something that really amazed me is that the news section of 2003-08-26 contained two links that didn't show up as links - the binary and the source RPMs. But not one person ever complained about that. Strange. ;-)
RPM for SuSE Linux 9.2: kdirstat-2.4.3-0.1.i586.rpm (313 K)
Source RPM for SuSE Linux: kdirstat-2.4.3-0.1.src.rpm (231 K)
Source tarball: kdirstat-2.4.3.tar.bz2 (623 K)

Changes:

  • Added "Open with" cleanup upon request by Jarl Friis <jarl@softace.dk>
  • Migration to KIO slave trash:/ for "move to trash" cleanup (querying KDE version >= 3.4 at runtime)
  • Added configuration update for safer transition from old-style fixed "*/Trash" paths to "%t" placeholder
  • Fixed lots of KDE libs "deprecated" warnings
  • Reimported admin/ subdir from a recent KDE version (3.3.0)
  • Fixed KPacMan rendering in toolbar (thanks to Coolo)
  • Updated German translation
  • Added Italian translation by Giuliano Colla <colla@copeca.it>
  • Applied i18n patch by Toyohiro Asukai <toyohiro@ksmplus.com>
  • Updated Japanese translation by Toyohiro Asukai <toyohiro@ksmplus.com>
  • Fixed treemap context menu popup location
  • Added Hungarian translation contributed by Marcel Hilzinger <hili@suselinux.hu>

2003-08-26 New Stable Release: kdirstat-2.4.0.tar.bz2

It's been a while since the last release of KDirStat that was officially declared stable. Nevertheless, the last few releases had turned out to be so rock-solid that it makes perfect sense to simply call the latest (2.3.7) stable - remember, KDirStat follows the Linux kernel tradition of using even release numbers for stable versions and odd numbers for experimental "hacker" versions.

I threw in a fix for a long-standing bug that was more of an annoyance than a real problem: When quitting KDirStat while it was still reading directories, it crashed. Well, OK, you wanted it to terminate, but probably not with a core dump. ;-) Anyway, this is fixed now.

RPM for SuSE Linux 8.2: kdirstat-2.4.0-0.i586.rpm (267 K)
Source RPM for SuSE Linux: kdirstat-2.4.0-0.src.rpm (218 K)
Source tarball: kdirstat-2.4.0.tar.bz2 (625 K)

In case you are wondering why the source RPM is so much smaller than the source tarball: The tarball comes with a complete admin/ subdirectory that contains all the autoconf / automake magic PLUS it includes a ready-made configure script so you don't need autoconf, automake and libtool for building. All that stuff eats up a lot of disk space and bandwidth, however, and the SuSE build logic reimports and recreates all that anyway from the installed system so it's no use including it in the source RPM.

Changes Overview:

  • Fixed crash when quitting program while directory reading in progress.
  • Fixed crash when trying to open another directory for reading while reading still in progress. This is really just a variant of the above bug.
  • Added German translation contributed by Christoph Eckert.
  • Added Stop Reading toolbar button (and menu entry) to abort directory reading while in progress.

    Many users had asked for that for quite a while. It doesn't make too much sense to me: The overall sums are greatly distorted when you do that - depending on what has been read and what not yet. This makes relative sizes (percentages) pretty useless. Directories that were not finished reading now get a special "stop" icon and their "total" fields are prefixed with ">" (e.g., ">42 MB") to indicate that the real number is most likely higher.

    It is arguable whether or not creating a treemap view makes any sense while the numbers are not complete. After considering the pros and cons I decided to display the treemap view even after hitting that new "stop" button; after all, you can still pick some very fat files out of what is already read. You can forget the relative directory sizes, though (albeit I guess nobody uses the treemap view for that anyway - that makes much more sense in the normal tree view).

What the Future Will Bring (after 2.4.0)

Treemap Color Editor

The treemap colors that are used now are hard-coded - which of course is bad. There will be a treemap color editor that will allow the user to define his favourite colors according to MIME type or according to file name patterns (which is probably much less expensive in terms of performance) or both.

Right now there are only a few file types that are recognized. As a general rule of thumb: Red or reddish means bad (core dumps, *.bak, *~ etc.) - stuff you probably don't really need or want to have. Green means compressed stuff - optimum use of disk space. Blue means documents of some kind, cyan are executables or related (libs etc.). Yellow is your MP3 or video collection - stuff where you can most likely save a lot of disk space. ;-)
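
Coloring by file name pattern could look roughly like the following Python sketch. It is not the rule set hard-coded in KDirStat: only "core", "core.*", "*.bak" and "*~" are patterns mentioned on this page; all other patterns, the default color and the function name are made-up examples.

```python
import fnmatch

# (pattern, color) pairs, checked in order. Only "core", "core.*", "*.bak"
# and "*~" come from this page; every other pattern is a made-up example.
COLOR_RULES = [
    ("core",   "red"),
    ("core.*", "red"),
    ("*.bak",  "red"),
    ("*~",     "red"),
    ("*.gz",   "green"),
    ("*.bz2",  "green"),
    ("*.txt",  "blue"),
    ("*.so",   "cyan"),
    ("*.mp3",  "yellow"),
    ("*.mpg",  "yellow"),
]

def treemap_color(file_name, default="grey"):
    """Return the color of the first rule whose pattern matches file_name."""
    for pattern, color in COLOR_RULES:
        if fnmatch.fnmatchcase(file_name, pattern):
            return color
    return default
```

A plain list scanned from top to bottom keeps such rules easy to edit by hand, and pattern matching on the name avoids having to look into each file to determine its MIME type.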

I had planned to wait with 2.4.0 for that color editor, but I hadn't reckoned with a summer in Germany to be that long. Blame it on the weather. ;-)

KDirStat as KPart

Making KDirStat a KPart is not forgotten, but it will definitely come after the color editor - one reason being that getting all that automake/autoconf/libtool stuff right to set this up is a major nightmare.

If anybody feels up to that without moving KDirStat into the KDE CVS I will gladly accept a patch...

Multiple Selections

This is the single most-wanted enhancement for KDirStat: Being able to select more than just one file or directory for cleanup actions. Unfortunately, this messes up much of the internal logic how selections are handled.

If (and when) I have a really good idea how to do that, I will.

Explicitly Excluding Directories

Somebody asked for this not long ago: Exclude specific directories from reading. It makes sense to me, and it shouldn't be too hard to implement. The harder part of this is probably all the configuration stuff around that. Well, maybe we'll start with a static list in the user's home directory - for people who have not completely forgotten yet how to handle a text editor. ;-)

Fix the PacMan Widget

The PacMan animation widget in the tool bar doesn't support custom widget themes like Keramik that require using Qt's polish magic. This is why it always stands out like a grey blob against the neatly rendered background. It looks very unprofessional.

Anybody up for a patch?

Another Busy State Animation

According to user feedback mails, about as many users seem to like the PacMan animation as not. I would greatly welcome an alternative animation.

Free Disk Space Display

Guess where the disk free display used in the YaST2 package manager comes from? Do the percentage bars look familiar to you? This is something I had planned all along to recycle in KDirStat. I am only unsure whether to open a new window for that or to squeeze that into the main window or whatever.

Opinions?

Moving KDirStat into the KDE CVS

Over my dead body. ;-)

All kidding aside: I usually don't have time for KDirStat exactly when the KDE folks decide to release a new KDE version or when something breaks deep within because of KDE API changes or because somebody decides that it's way cooler to use the ultra-new experimental Autoconf or Automake or Qt-Lib.

Been there, done that, hated it from the bottom of my heart.

Open Source is usually good quality software primarily because people really like making it. Because it is fun. Because of the aesthetics of the resulting code. But all this is there only as long as there isn't additional outside pressure - pressure like needing to change stuff that used to work perfectly but doesn't work any more with the latest bleeding edge (and boy is that stuff bloody!) Autoconf or Libtool or whatever. This is no fun. This can drive you crazy.


2003-08-19 New RPMs for RedHat Linux

Daniel Tschan kindly provided KDirStat 2.3.7 RPMs for various versions of RedHat Linux:

RedHat 9: kdirstat-2.3.7-1.i386.rpm
RedHat 8.0: kdirstat-2.3.7-1.i386.rpm
RedHat 7.3: kdirstat-2.3.7-1.i386.rpm
Source RPM: kdirstat-2.3.7-1.src.rpm (same for all versions of RedHat Linux)

2003-05-25 Performance Boost: kdirstat-2.3.7.tar.bz2 (Development Version)

Milos Prudek pointed out to me that there was a huge performance increase if you minimized KDirStat's window during directory reading. I couldn't quite believe that: I had already optimized KDirStat to perform screen updates no more than three times a second. But he actually went through the trouble of benchmarking those scenarios to convince me.

He was correct. Minimizing the window made directory reading roughly 20 times faster!

After some investigations I found out that items of KDirStat's internal tree that holds directory information were cloned as QListView items much too generously, leading to about as many QListView items as there are directories in the tree you are scanning. Performance suffers a lot for QListView widgets with some 150000 items. Oops...

This was clearly not desired: Only those items you can see (or those you opened manually) should be cloned from the internal tree to the QListView widget. This is (the most important part of) what I fixed for this release.

The resulting benchmarks are impressive:
Test Case                        Size      Items  Subdirs  Old read time     New read time  Improvement factor
~sh (My home directory)          120 MB     3100      340  27 sec.           0.6 sec.       45 (!)
/work                            500 MB    36700     3570  1:03 (63 sec.)    8 sec.         7.9
/ (root dir. of SuSE Linux 8.2   2.85 GB  151000     9900  6:37 (397 sec.)   25 sec.        15.9
  Professional installation
  default+development)
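
For those who want to check the numbers: the improvement factor is simply the old read time divided by the new one. A few lines of Python (the function name is made up) recompute it from the times given above.

```python
def improvement_factor(old_seconds, new_seconds):
    """How many times faster the new version is, rounded to one decimal."""
    return round(old_seconds / new_seconds, 1)

# (old read time, new read time) in seconds, taken from the benchmarks above
benchmarks = {
    "~sh":   (27, 0.6),
    "/work": (63, 8),
    "/":     (6 * 60 + 37, 25),   # 6:37 = 397 sec.
}

for test_case, (old, new) in benchmarks.items():
    print(test_case, improvement_factor(old, new))
```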

Other changes:

RPM for SuSE Linux 8.2: kdirstat-2.3.7-0.i586.rpm
Source RPM: kdirstat-2.3.7-0.src.rpm

2003-04-10 RPMs for Mandrake 9.1

Sir Pingus kindly provided KDirStat 2.3.6 RPMs for Mandrake Linux 9.1:
RPM for Mandrake Linux 9.1: kdirstat-2.3.6-1mdk.i586.rpm
Source RPM: kdirstat-2.3.6-1mdk.src.rpm
Spec File for Mandrake Linux 9.1: kdirstat.spec

2003-02-03 New Bugfix Release: kdirstat-2.3.6.tar.bz2 (Development Version)

RPM for SuSE Linux 8.1: kdirstat-2.3.6-0.i586.rpm
Source RPM: kdirstat-2.3.6-0.src.rpm

Changes:

  • Fixed crash on startup when no config file was present.
  • Fixed crash in treemaps when deleting subtrees in a cleanup action.
  • Improved enabling/disabling of treemap actions.

2003-01-30 Colored Treemap: kdirstat-2.3.5.tar.bz2 (Development Version)

KDirStat with colored treemap
RPM for SuSE Linux 8.1: kdirstat-2.3.5-0.i586.rpm
Source RPM: kdirstat-2.3.5-0.src.rpm

Treemap Now Colored According to File Type

Known file types get their own special colors so you can see at a glance from the colors if it's your MP3 collection, images, compiled objects or other stuff that clutters up your disk. You can now locate core dumps (red), images (cyan), object files (orange) or archives (green) hidden deep in the directory hierarchy.

I found it really amazing that this little more color helped me see images and archives in deeply nested system directories in a treemap that covered my entire root file system.

The current color rules are still static, but of course in a future version this will be user-customizable. Stay tuned.

Other Changes:

  • Added new '%t' cleanup placeholder for the KDE trash directory. This makes KDirStat more robust against changing that path - either between different KDE versions or due to user customizations.

    Note: It is recommended to change existing configurations to use this '%t' placeholder: Either click Defaults in the KDirStat settings (you will lose all customizations!) or change it manually:

    Open the Settings dialog, select the Cleanups tab, locate the "Delete (to Trash Bin)" cleanup and change the command line from:

       kfmclient move %p ~/KDesktop/Trash
    
    to:
       kfmclient move %p %t


  • Read jobs are now displayed in the percentage bar column - there is no more additional tree view column added to the right that scrolls out of visibility all the time.
  • User cleanups now have application-wide keyboard shortcuts (Ctrl-0, Ctrl-1, Ctrl-2, ...).
  • Fixed treemap segfaults when re-reading directories.
  • Synchronize treemap selection with dir tree after treemap rebuild (the current selection had been lost in the treemap after a treemap rebuild).
  • Changed activity point handling: The user was prompted far too early to send feedback mail.

    This was because clicking into the treemap opened and closed a lot of items in the tree view which in turn accumulated many "activity points" that triggered that popup that asks for feedback mail. Of course I didn't have that problem since I am long since way beyond that activity point boundary. Oops. ;-)

  • Changed treemap double click handling:

    Now double clicking the middle button zooms out, double clicking the right button does nothing (it pops up the context menu before receiving the second click anyway).


2003-01-07 New (Development) Version: kdirstat-2.3.4.tar.bz2

RPM for SuSE Linux 8.1: kdirstat-2.3.4-0.i586.rpm
Source RPM: kdirstat-2.3.4-0.src.rpm

Changes:

  • Gcc 3.x fixes
  • Updated admin subdir to latest KDE autoconf / automake stuff
  • Tweaked treemap cushion ridges: Squarified layout row now gets its own ridge, no more double/triple ridges for directories.
  • Changed treemap cushion light source direction from bottom right to top left.
  • Moved min/max/default for treemap settings to central header file.
  • Changed max/default treemap setting values.
  • Reduced settings dialogs outer borders: No more accumulated borders.


2003-01-05 Treemaps are back: kdirstat-2.3.3.tar.bz2

KDirStat with treemap
Source RPM: kdirstat-2.3.3-0.src.rpm
RPM for SuSE Linux 8.0: kdirstat-2.3.3-0.i386.rpm
Warning: This RPM does not run on SuSE Linux 8.1!
(C++ compiler ABI changes)
RPM for SuSE Linux 8.1: follows shortly - I have been too lazy so far to update to 8.1 at home.

As the odd (2.3.x) release number implies, this is a development version. It may still have a few bugs (even though I found it pretty stable). So, please don't use this version in a critical production environment. If you really must use this version in a nuclear power plant, please make sure it's far away from central Europe. ;-)

So, what is it all about? (more at the treemap features section)

For those of you who don't know anything about treemaps yet: A treemap is just another way to display a tree where each node has an associated value - such as a directory tree with file sizes. All the end user needs to know is that each rectangle in a treemap corresponds to one file (or directory), and the larger the rectangle is, the larger is that file (or directory).
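
To give an idea of how rectangles can be derived from sizes, here is a minimal Python sketch of the simplest layout scheme: each directory's rectangle is cut into strips, one per child, and the cutting direction alternates from one tree level to the next. This is an illustration only, not KDirStat's implementation. The tuple-based tree representation and the function names are assumptions made for this example.

```python
def total_size(node):
    """Size of a file, or the sum of everything below a directory."""
    _name, content = node
    if isinstance(content, list):
        return sum(total_size(child) for child in content)
    return content

def plain_treemap(node, x, y, width, height, horizontal=True):
    """Return (name, x, y, width, height) for node and all its descendants.

    A node is (name, size) for a file or (name, [children]) for a
    directory. Each child gets a strip of its parent's rectangle in
    proportion to its total size.
    """
    name, content = node
    tiles = [(name, x, y, width, height)]
    if isinstance(content, list):
        parent_size = total_size(node)
        offset = 0.0
        for child in content:
            share = total_size(child) / parent_size if parent_size else 0.0
            if horizontal:
                tiles += plain_treemap(child, x + offset, y,
                                       width * share, height, False)
                offset += width * share
            else:
                tiles += plain_treemap(child, x, y + offset,
                                       width, height * share, True)
                offset += height * share
    return tiles
```

For a tree like ("root", [("a", 60), ("sub", [("b", 30), ("c", 10)])]) laid out on a 100 x 100 area, "a" gets 60% of the area and "b" and "c" share the remaining 40% in a 3:1 ratio.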

Using this is really easy: Look at the treemap. Find large rectangles. Click on them - voila, they are selected in the traditional tree (list) above, and you can see more details. Even better, if you find you don't want that file any more, you can immediately use KDirStat's cleanup actions and delete it (or compress it or whatever).

Thus, the treemap can help you pinpoint large files that are hidden deep within a directory tree. You can see the entire tree at once in the treemap, so you can easily identify large files.

On the other hand, the traditional tree view (the list) is far better at showing accumulated sizes for entire directory branches. The combination of both is what makes a really powerful tool: Both views are slaved to one another, so if you select a file or directory in the treemap, it is automatically located and selected in the tree view - and vice versa.
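The basic idea described above (one rectangle per file, with an area proportional to its size) can be sketched in a few lines. This is a minimal illustration of a plain treemap that alternates the split direction at each directory level. It is not KDirStat's code; the `Node` class and the `plain_treemap` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    size: int = 0                                  # own size; 0 for directories
    children: list = field(default_factory=list)

    def total(self):
        """Accumulated size of this node plus everything below it."""
        return self.size + sum(child.total() for child in self.children)


def plain_treemap(node, x, y, w, h, horizontal=True, out=None):
    """Collect (name, x, y, w, h) for every leaf below `node`."""
    if out is None:
        out = []
    if not node.children:                          # a file (or empty directory)
        out.append((node.name, x, y, w, h))
        return out
    total = sum(child.total() for child in node.children)
    offset = 0.0
    for child in node.children:
        share = child.total() / total if total else 0.0
        if horizontal:                             # split left to right
            plain_treemap(child, x + offset, y, w * share, h, False, out)
            offset += w * share
        else:                                      # split top to bottom
            plain_treemap(child, x, y + offset, w, h * share, True, out)
            offset += h * share
    return out


# Hypothetical example tree: 1000 units in total, drawn into a 100 x 50 area.
tree = Node("root", children=[
    Node("big.iso", 600),
    Node("docs", children=[Node("a.txt", 100), Node("b.txt", 300)]),
])
rects = plain_treemap(tree, 0, 0, 100, 50)
```

In this example "big.iso" holds 60 % of the total size and therefore receives 60 % of the drawing area, which is what makes large files stand out at a glance.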

The authors of SequoiaView of the Technical University Eindhoven in the Netherlands kindly made their papers about treemaps available to the public (thanks again, folks!). Those papers were the base for this treemap implementation.

As the treemap implementation evolved (hence the version number 2.3.3 - 2.3.0, 2.3.1, and 2.3.2 existed briefly, but I didn't have a chance to release them to the public - believe it or not, no Internet access where I wrote that), there were several variants of treemaps:
  • Plain treemaps.

    Simple to implement, but you get a lot of very thin, elongated rectangles that are hard to point at with the mouse and whose sizes are even harder to compare against each other.

Plain treemap
  • Squarified treemaps.

    It took me a while to figure out the layout algorithm (even though described by Mark Bruls, Kees Huizing and Jarke J. van Wijk from the TU Eindhoven) and implement it, but the result is well worth the effort: The resulting treemap rectangles (tiles) are much more square-like. There are very few of those thin, elongated rectangles left.

    On the downside, very few hints about the directory structure are left. Where simple treemaps simply change direction after each subdivision, the only visual hint you get with squarified treemaps is that larger rectangles are clustered at the top left corner of the area that belongs to a directory. Some more hints were needed.

Squarified treemap
  • Squarified cushion treemaps.

    Those additional hints come in the shape of "cushion" rendering as described by the authors of SequoiaView in one of their papers. This is an elegantly simple 3D-rendering algorithm that gives each treemap tile the appearance of a lighted "bump".

    At first, I was really reluctant to introduce that into KDirStat - I had feared it would take way too long to calculate that for each of those treemap tiles. Anybody remember the first experimental treemaps in kdirstat-1.7.x? Remember how that cushion rendering took about 20-30 seconds on average (PIII-500 class) machines? I didn't want that. I wanted KDirStat to remain fast and responsive.

    Well, it is. Creating a treemap never took more than fractions of a second on the same class of machine (Athlon-550) - too fast to be measured reliably. The key was a series of optimizations:

    • Use QCanvas as the base widget class. This thing is real fast.
    • Render the image into a QImage rather than immediately into a QPixmap.
    • Cache those QPixmaps for redisplay.
    • Omit tiny files. Only files or directories that result in tiles with at least 3 pixels in each direction are taken into account. This optimization gets rid of a great many tiny files that are irrelevant for the purpose of KDirStat.
Squarified cushion treemap
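The squarified layout mentioned in the list above can be sketched as follows. This is a simplified, non-recursive reading of the published algorithm for one directory level, not KDirStat's implementation: tiles are sorted by size and added to a row along the shorter side of the remaining area for as long as the worst aspect ratio in that row does not get worse. The function names `worst_ratio` and `squarify` are made up for this example, and all sizes are assumed to be positive.

```python
def worst_ratio(row, side):
    """Worst aspect ratio among tiles of areas `row` stacked along `side`."""
    s = sum(row)
    return max(side * side * max(row) / (s * s),
               s * s / (side * side * min(row)))


def squarify(sizes, x, y, w, h):
    """Return (x, y, w, h) tiles for `sizes`, largest first (sizes > 0)."""
    scale = w * h / sum(sizes)                     # sizes -> areas in the rect
    remaining = sorted((s * scale for s in sizes), reverse=True)
    tiles = []
    while remaining:
        side = min(w, h)                           # rows run along the short side
        row = [remaining.pop(0)]
        # Keep adding tiles while that does not worsen the worst aspect ratio.
        while remaining and (worst_ratio(row + [remaining[0]], side)
                             <= worst_ratio(row, side)):
            row.append(remaining.pop(0))
        thickness = sum(row) / side
        offset = 0.0
        for area in row:
            length = area / thickness
            if w >= h:                             # vertical strip on the left
                tiles.append((x, y + offset, thickness, length))
            else:                                  # horizontal strip on top
                tiles.append((x + offset, y, length, thickness))
            offset += length
        if w >= h:                                 # shrink the remaining area
            x, w = x + thickness, w - thickness
        else:
            y, h = y + thickness, h - thickness
    return tiles


# Seven items laid out in a 6 x 4 area.
tiles = squarify([6, 6, 4, 3, 2, 2, 1], 0, 0, 6, 4)
```

Because the largest tiles are placed first, starting at the origin, they end up clustered at the top left corner of the area, which matches the visual hint described above.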

Since there is so much room for experimenting, I chose to make all those combinations available to the user. If you don't like the treemap settings, tweak them to meet your personal preferences:

Treemap configurations

More details at the treemap features section.
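To give an idea of how little is needed for the "lighted bump" effect, here is a rough sketch of cushion shading for a single tile. It follows the parabolic ridge idea from the cushion treemap papers as understood here, reduced to one nesting level, and it is not taken from the KDirStat sources. The names `ridge`, `render_cushion` and `MIN_TILE`, as well as the default values for ridge height, light direction (top left) and ambient light, are assumptions for illustration only. The nested list of brightness values stands in for the off-screen image; tile coordinates are assumed to be integer pixels.

```python
import math

MIN_TILE = 3          # pixels; smaller tiles are skipped, as described above


def ridge(lo, hi, height):
    """Coefficients (linear, quadratic) of a parabolic ridge over [lo, hi]."""
    span = hi - lo
    return 4 * height * (hi + lo) / span, -4 * height / span


def render_cushion(x, y, w, h, height=0.5, light=(-1.0, -1.0, 2.0), ambient=0.2):
    """Return rows of brightness values for one tile, or None if too small."""
    if w < MIN_TILE or h < MIN_TILE:
        return None
    lx, ly, lz = light
    norm = math.sqrt(lx * lx + ly * ly + lz * lz)
    lx, ly, lz = lx / norm, ly / norm, lz / norm   # unit vector towards the light
    sx1, sx2 = ridge(x, x + w, height)
    sy1, sy2 = ridge(y, y + h, height)
    rows = []
    for iy in range(y, y + h):
        row = []
        for ix in range(x, x + w):
            # Surface normal of z = sx2*x^2 + sx1*x + sy2*y^2 + sy1*y
            # evaluated at the pixel center.
            nx = -(2 * sx2 * (ix + 0.5) + sx1)
            ny = -(2 * sy2 * (iy + 0.5) + sy1)
            cos_angle = (nx * lx + ny * ly + lz) / math.sqrt(nx * nx + ny * ny + 1.0)
            row.append(ambient + max(0.0, (1.0 - ambient) * cos_angle))
        rows.append(row)
    return rows


image = render_cushion(0, 0, 8, 8)
```

Each pixel costs only a handful of multiplications and one square root, which is consistent with the observation that the rendering itself need not be slow; multiplying each brightness value with the tile's base color would give the final pixel color.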

Other Improvements of this Version

  • Some new configuration options:
    • Cross file system boundaries.

      Normally, KDirStat does not continue reading mounted file systems if it finds a mount point. This is very useful if you want to find out why your root file system is full and even more if you have a lot of NFS mounts (your sysadmin will kill you if you scan all the network servers - network performance will go to hell right away). Sometimes, however, it may be useful to add up everything, including mounted file systems. This option can toggle that.

      BTW you can of course always explicitly select "continue reading at mount point" if you want just one single mounted file system to be read.

    • Turn off KDirStat's internal optimized local directory reading methods.

      Not only are those methods a whole lot faster than KDE's network transparent KIO methods, they are a basic requirement to restrict directory scans to one file system - KIO won't tell what device a directory is on.

      This option is mainly useful for benchmarking KIO against KDirStat's methods that use the readdir() / lstat() system calls.

    • Turn off the P@cM@n animation in the tool bar.

      Some people didn't like it.

    • Turn on P@cM@n in the tree view while directories are read.

      This "P@cM@n armada" mode eats quite some performance, but if you have the computing power, you can have some more fun watching KDirStat as it works. BTW this mode had always been there in the code - it had simply not been accessible from the outside.

  • "Open URL" in the "File" menu.

    This opens a URL requester dialog where you can enter generic URLs that may include ftp:, smb:, tar: etc. protocols - everything supported by KDE's KIO-slaves. Formerly, you had to specify those on KDirStat's command line. It may be a little-known fact that KDirStat can also scan FTP or Samba servers. Now that is a little easier.

  • Changed the tarball format from Gzip to Bzip2.

    This saves 200 k download size: kdirstat-2.3.3.tgz was 834 k big, kdirstat-2.3.3.tar.bz2 is only 630 k. Bzip2 should now be available on all platforms that can run KDE. If you are unfamiliar with Bzip2: Simply use
    tar xjvf kdirstat-2.3.3.tar.bz2
    rather than
    tar xzvf kdirstat-2.3.3.tgz
    (with GNU tar - which is the best anyway) to unpack the archive.
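The link between local directory reading and the "cross file system boundaries" option described above comes down to the device number that lstat() reports for each entry. The following is a minimal sketch of that idea, not KDirStat's actual code: the function name `scan` and its parameters are hypothetical, and it sums plain `st_size` values, which is an assumption about what gets counted.

```python
import os
import stat


def scan(path, cross_filesystems=False, root_dev=None):
    """Return the accumulated st_size in bytes of `path` and all below it."""
    try:
        st = os.lstat(path)                  # lstat(): do not follow symlinks
    except OSError:
        return 0                             # vanished or unreadable entry
    if root_dev is None:
        root_dev = st.st_dev                 # device of the starting directory
    if st.st_dev != root_dev and not cross_filesystems:
        return 0                             # mount point: other file system
    total = st.st_size
    if stat.S_ISDIR(st.st_mode):
        try:
            with os.scandir(path) as entries:    # readdir() on POSIX systems
                for entry in entries:
                    total += scan(entry.path, cross_filesystems, root_dev)
        except OSError:
            pass                             # directory we may not read
    return total
```

Calling `scan("/")` would then stay on the root file system, while `scan("/", cross_filesystems=True)` would add up everything, including mounted file systems. Without the device number there is nothing to compare against, which is why this kind of restriction needs the local reading methods.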

Plans for the Future

Before heading for KDirStat-2.4.0 (the next version intended for use in production environments), I want to make sure that stability and resource consumption haven't suffered because of all the changes due to treemaps. The prime objective for a "stable" (even-numbered, i.e., 2.4.x) release is to make sure it deserves that label.

As for features, I'd like to make treemaps MIME-type aware before the next stable release: Treemap tiles should be colored according to the type of the corresponding file. If you have lots of MP3s or PNGs or object files on your hard disk, you should be able to tell that from how many blue or yellow or whatever rectangles you see. Of course this will be user configurable to a high degree.

But what bugs me most is how to make that really efficient - KDE's MIME type recognition is really great, but it can require a lot of performance: KDE doesn't just look at filename extensions, it also looks into the file to look for magic numbers in headers etc.

Maybe this will be a simple implementation: Just according to filename extensions, no magic numbers. But on the other hand, I'd like to use the host of MIME types that KDE already knows - all files classified as "audio/*" (MP3, WAV, whatever), all files classified as "image/*" (PNG, JPG, GIF, TIF, BMP, ...) etc.

Maybe it will be a hybrid approach: Simple filename extension lookup for small files (i.e., small resulting treemap tiles), exact but expensive full-fledged MIME type recognition for larger ones. Maybe KDirStat should build its own internal cache for those expensive operations so they are not required all over again just because a directory has to be re-read or because the treemap is rebuilt - which it is very frequently: Complete rebuilds are necessary for all kinds of changes, otherwise the treemap would get "holes" where deleted files have been.
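To make the trade-off concrete, such a mixed scheme could look roughly like the sketch below. This is only an illustration of the idea, not a plan for the real implementation: it uses Python's standard `mimetypes` module in place of KDE's MIME type machinery, and the size threshold, the tiny magic number table and the names `classify`, `by_extension`, `by_content` and `category` are all assumptions made up for this example.

```python
import mimetypes

# A few well-known file header signatures; a real table would be much longer.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF8", "image/gif"),
    (b"ID3", "audio/mpeg"),
]

SMALL_FILE_LIMIT = 64 * 1024      # assumed threshold in bytes
_cache = {}                       # (path, size) -> MIME type


def by_extension(path):
    """Cheap lookup based on the filename extension only."""
    mime, _encoding = mimetypes.guess_type(path)
    return mime or "application/octet-stream"


def by_content(path):
    """Expensive lookup: open the file and compare its header bytes."""
    try:
        with open(path, "rb") as f:
            header = f.read(16)
    except OSError:
        return by_extension(path)
    for magic, mime in MAGIC:
        if header.startswith(magic):
            return mime
    return by_extension(path)


def classify(path, size):
    """Extension lookup for small files, cached content check for large ones."""
    if size < SMALL_FILE_LIMIT:
        return by_extension(path)
    key = (path, size)
    if key not in _cache:
        _cache[key] = by_content(path)
    return _cache[key]


def category(mime):
    """Top-level type such as "audio" or "image", usable as a color key."""
    return mime.split("/", 1)[0]
```

With this, `category(classify("song.mp3", 10))` yields "audio" without ever opening the file, while a large file is opened once and answered from the cache on every later treemap rebuild.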

Stay tuned.

P.S. Comments are welcome, as always.

2002-10-23 New RPM for RedHat 8.0

Daniel Tschan kindly provided a KDirStat 2.2.0 RPM for RedHat Linux 8.0:

kdirstat-2.2.0-1.i386.rpm (RedHat only!)


2002-08-08 Current Development Status

I am in the process of reimplementing treemaps for KDirStat.

The CVS version already has an (albeit pretty experimental) version of that,

KDirStat with new (experimental) treemaps

but:

  • It's a resource hog since it uses a QFrame derived widget for each treemap tile (and there are lots of those).
  • It isn't exactly pretty - only grey frames.
  • No geometry management yet - the treemap will remain at a predefined fixed size.
  • It leaves dangling pointers when using some cleanups, resulting in a core dump when clicking on that thing after using such a cleanup action.

On the pro side, communication between the traditional tree view and the tree map view works pretty well: If you click on a tile in the treemap, the corresponding item in the tree view is selected, scrolled to be visible and opened, closing all previously opened items. This makes it pretty easy to find large single files, even if they are nested deeply within the directory hierarchy.

What I intend to do now is rework that part to make more efficient use of system resources (maybe QCanvas is the way to go), reintroduce pretty 3D shading of some kind and use different colors to visualize different types of files so you can see if it's your MP3 collection, images from the web, core dumps or something else that uses your disk space.

Oh yes, and I flatly refuse to recycle any of the old treemap code. It cost me too much precious time already trying to make any sense of that, and there are aesthetic reasons as well (remember, this is a purely private fun project, and without the fun part there isn't too much left...).


2002-08-08 New RPMs for RedHat 7.3

Daniel Tschan kindly provided KDirStat 2.2.0 RPMs for RedHat Linux 7.3 along with a spec file for RedHat Linux:

RPM: kdirstat-2.2.0-1.i386.rpm
Source RPM: kdirstat-2.2.0-1.src.rpm
Spec file (for RedHat only!): kdirstat.spec

News Archive


Getting KDirStat

Get the latest tarball from the download area and read the build instructions.

Usually there is also an RPM package that runs on the latest SuSE Linux distribution, so you don't need to bother building KDirStat yourself if you are using SuSE Linux.

For RedHat RPMs, you can look at the Guru Labs download pages - they may or may not have the latest version available as RPMs. If they don't have the latest version yet, please be patient - it may be coming soon.



KDirStat CVS

The KDirStat CVS is up and running at SourceForge.

Both read/write access for the development team and public (anonymous) read-only access are available.


Public CVS Access

For those who need to be on the bleeding edge of development, there is public CVS access kindly hosted at SourceForge.


Features

Display Features

Treemap Display

Directory Reading

Cleaning up

Misc