Fully Utilizing Your X-Core CPU
Almost all systems sold nowadays have at least a dual-core CPU, and triple- or quad-cores are getting cheaper and will become standard in the near future. But how do you use your shiny x-core to its full potential with applications that only use one core? Linux, like all unixoid operating systems, has strong multitasking capabilities, so there is an easy way to parallelize tasks that normally use only one core of an x-core CPU.
You should be comfortable working in the shell, because all the work is done there. No browser, no fancy GUIs, no eye candy.
Disclaimer
The following article describes the way I installed and used the software. I do not guarantee that the same approach works for you.
1. Parallel Processing Shell Script
This script is hosted on http://code.google.com/p/ppss/. It is released under BSD licensing terms, changing to GPL with version 2.55. It is not packaged in the repositories of the popular Linux distributions, so it has to be installed manually.
Full, detailed documentation is available on the hosting site, and the script can be downloaded from there as an archive. As the archive contains only the script, I unpacked it to /usr/local/bin and made sure that the script belongs to root and is world-readable and executable:
wget http://ppss.googlecode.com/files/ppss-2.50.tgz
tar xvzf ppss-2.50.tgz -C /usr/local/bin
chown root:root /usr/local/bin/ppss.sh && chmod a+rx /usr/local/bin/ppss.sh
ls -l /usr/local/bin/ppss.sh
-rwxr-xr-x 1 root root 54612 2009-12-17 16:40 /usr/local/bin/ppss.sh*
Besides bash, the only requirement is the mkfifo command, which is usually part of the GNU coreutils package. As we are installing without a package manager, you have to make sure manually that all dependencies are met. ppss.sh runs not only on Linux but also on Mac OS X, and maybe on *BSD as well.
When you call it without any arguments, it should show its help screen:
$ ppss.sh
|P|P|S|S| Distributed Parallel Processing Shell Script 2.50
usage: /usr/local/bin/ppss.sh [ -d| -f ] [ -c ' "$ITEM"' ] [ -C ] [ -j ] [ -l ] [ -p <# jobs> ] [ -D ] [ -h ] [ --help ]
Examples:
/usr/local/bin/ppss.sh -d /dir/with/some/files -c 'gzip '
/usr/local/bin/ppss.sh -d /dir/with/some/files -c 'gzip "$ITEM"' -D 5
/usr/local/bin/ppss.sh -d /dir/with/some/files -c 'cp "$ITEM" /tmp' -p 2
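Since nothing resolves the dependencies for us, a quick check before the first run does no harm. The following is only a minimal sketch; check_deps is a helper name made up for this article and is not part of ppss.sh:

```shell
#!/usr/bin/env bash
# check_deps (hypothetical helper): report every command from the
# argument list that cannot be found on PATH.
# Returns 0 if all commands were found, 1 otherwise.
check_deps() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# The two requirements named in the text:
check_deps bash mkfifo && echo "all dependencies found"
```

If something is reported as missing, install it with your distribution's package manager before calling ppss.sh.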
For basic usage of ppss.sh we don't need to configure anything, as the script is smart enough to find out how many cores the CPU of the machine it's running on has.
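If you are curious what number to expect, you can ask the system yourself. This is a sketch of common ways to count logical processors, not necessarily the method ppss.sh uses internally; count_cores is a name invented for this example:

```shell
#!/usr/bin/env bash
# count_cores (hypothetical helper): print the number of logical
# processors, trying the most common tools in turn.
count_cores() {
    if command -v nproc >/dev/null 2>&1; then
        nproc                                # GNU coreutils
    elif getconf _NPROCESSORS_ONLN >/dev/null 2>&1; then
        getconf _NPROCESSORS_ONLN            # Linux (glibc), Mac OS X
    else
        grep -c '^processor' /proc/cpuinfo   # Linux fallback
    fi
}

echo "logical processors: $(count_cores)"
```

The help screen above also lists a -p option for the number of jobs, in case you want to set it by hand.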
2. Examples
A job perfectly suited to being parallelized with ppss.sh is converting a bunch of .flac files to, for instance, Ogg Vorbis files for listening on your portable player. This could easily be done in a for loop, and we are lucky that oggenc is capable of reading .flac files:
for i in *.flac; do oggenc -q3 -o "${i%%flac}ogg" "$i"; done
This is the sequential approach: only one of your x-cores is used. This can be seen by issuing a
top -n1
in another window, while the conversion is running:
Tasks: 112 total,   2 running, 110 sleeping,   0 stopped,   0 zombie
Cpu(s): 56.9%us,  0.6%sy,  0.0%ni, 41.2%id,  1.1%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   4062968k total,  2303836k used,  1759132k free,    36268k buffers
Swap:   522072k total,        0k used,   522072k free,  1558228k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7248 root      20   0 19780 2604 1468 R   88  0.1   0:05.02 oggenc
...
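The loop above derives the output name with the shell's suffix removal: ${i%%flac}ogg cuts the trailing "flac" off the file name and appends "ogg". For a fixed suffix without wildcards, % and %% behave identically. A small sketch of the same idea (ogg_name is a name made up for this example):

```shell
#!/usr/bin/env bash
# ogg_name (hypothetical helper): map "name.flac" to "name.ogg".
# ${1%.flac} removes the shortest match of ".flac" from the end of $1.
ogg_name() {
    printf '%s\n' "${1%.flac}.ogg"
}

ogg_name "01 - Some Track.flac"   # prints: 01 - Some Track.ogg
```

The quotes around the expansion matter as soon as a file name contains spaces.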
Let's try to parallelize it with ppss.sh:
ppss.sh -d /path/to/your/flacfiles/ -c 'oggenc "$ITEM" -q3 -o "${ITEM%%.flac}.ogg"'
Instead of the loop variable i we have to use the variable ITEM, which ppss.sh sets to the current item, much like a loop variable. We get the following output on the screen:
Feb 03 20:16:00: =========================================================
Feb 03 20:16:00: |P|P|S|S|
Feb 03 20:16:00: Distributed Parallel Processing Shell Script version 2.50
Feb 03 20:16:00: =========================================================
Feb 03 20:16:00: Hostname: xxxxx
Feb 03 20:16:00: ---------------------------------------------------------
Feb 03 20:16:00: CPU: AMD Athlon(tm) Dual Core Processor 4850e
Feb 03 20:16:00: Found 2 logic processors.
Feb 03 20:16:00: Starting 2 parallel workers.
Feb 03 20:16:00: ---------------------------------------------------------
Feb 03 20:17:00: Currently 100 percent complete. Processed 7 of 7 items.
Feb 03 20:17:00: 1 job is remaining.
Feb 03 20:17:34: Finished. Consult ./ppss/job_log for job output.
And another
top -n1
shows that all cores are utilized:
Tasks: 117 total,   3 running, 114 sleeping,   0 stopped,   0 zombie
Cpu(s): 99.4%us,  0.6%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4062968k total,  2070004k used,  1992964k free,    38916k buffers
Swap:   522072k total,        0k used,   522072k free,  1709844k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
10933 root      20   0 19856 2600 1468 R   99  0.1   0:06.20 oggenc
10854 root      20   0 19856 2600 1468 R   87  0.1   0:08.51 oggenc
...
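To get a feeling for what such a pool of parallel workers does, here is a bare-bones sketch built on xargs. It is not how ppss.sh is implemented and it has none of its logging; it only shows the principle of feeding a list of items to a fixed number of workers. run_parallel is a name invented for this example, and -0 and -P are extensions found in GNU and BSD xargs rather than part of POSIX:

```shell
#!/usr/bin/env bash
# run_parallel (hypothetical helper): read NUL-delimited items from
# stdin and run the given command once per item, with at most $1
# commands running at the same time.
run_parallel() {
    local workers="$1"
    shift
    xargs -0 -n 1 -P "$workers" "$@"
}

# Harmless demonstration with two workers; the order of the output
# lines may vary from run to run.
printf '%s\0' "first item" "second item" "third item" |
    run_parallel 2 echo "processing:"

# With real files it could be fed by find, for example:
#   find /path/to/your/flacfiles -name '*.flac' -print0 |
#       run_parallel 2 sh -c 'oggenc -q3 -o "${1%.flac}.ogg" "$1"' _
```

What ppss.sh adds on top of this is the automatic detection of cores, the progress display and the job log shown above.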
Another example would be to convert the same .flac files to .mp3. This is a bit more complicated, because we need flac to decode the .flac files and lame to encode them. This can be done sequentially in a for loop, piping stdout of flac to stdin of lame:
for file in *.flac; do flac -cd "$file" | lame -h - "${file%.flac}.mp3"; done
Another
top -n1
shows that only one core is used:
Tasks: 118 total,   3 running, 115 sleeping,   0 stopped,   0 zombie
Cpu(s): 99.3%us,  0.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4062968k total,  2080004k used,  1982964k free,    37926k buffers
Swap:   522072k total,        0k used,   522072k free,  1709844k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
17837 root      20   0 19856 2600 1468 R   99  0.1   0:04.70 oggenc
...
The same parallelized by ppss.sh:
ppss.sh -d /path/to/flacfiles -c 'flac -cd "$ITEM" | lame -V 4 -B 160 - "${ITEM%.flac}.mp3"'
Here
top -n1
shows that both cores are utilized nearly to their full potential:
Tasks: 129 total,   3 running, 126 sleeping,   0 stopped,   0 zombie
Cpu(s): 98.2%us,  1.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.2%hi,  0.0%si,  0.0%st
Mem:   4062968k total,  2629956k used,  1433012k free,    33996k buffers
Swap:   522072k total,        0k used,   522072k free,  1890244k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7287 root      20   0 14788 2480 1124 R   93  0.1   0:13.41 lame
 7210 root      20   0 14788 2484 1124 R   89  0.1   0:23.59 lame
 7209 root      20   0 15776 1060  792 S    7  0.0   0:01.67 flac
 7286 root      20   0 15776 1064  792 S    6  0.0   0:00.95 flac
A last example (dealing with sound files) would be to downsample some .mp3 files encoded with 320kBit to 128kBit VBR, again to save precious space on the portable player. lame can do this on its own, without the separate decoder needed in the previous example:
for i in *.mp3; do lame "$i" -V 4 -B 160 "${i%.mp3}.downmp3"; done
Parallelizing this with ppss.sh looks like
ppss.sh -d /path/to/320kBit/mp3_files -c 'lame "$ITEM" -V 4 -B 160 "${ITEM%.mp3}.downmp3"'
and top -n1 again shows both cores at work:
Tasks: 132 total,   3 running, 129 sleeping,   0 stopped,   0 zombie
Cpu(s): 98.8%us,  1.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   4062968k total,  2441644k used,  1621324k free,    37068k buffers
Swap:   522072k total,        0k used,   522072k free,  1671080k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9073 root      20   0 14792 2724 1248 R   98  0.1   0:10.80 lame
 8994 root      20   0 14788 2720 1248 R   98  0.1   0:15.57 lame
As we do not want to overwrite the 320kBit files, we simply give the downsampled files the extension .downmp3. After the downsampling is done, you can move the downsampled files out of the way with mmv:
mmv "*.downmp3" "/new/path/#1.mp3"
This renames the files and moves them elsewhere in one command. mmv, by the way, should be in the repositories of most common distributions.
Another feature of ppss.sh is that it can read the items to process from a file. Say you have a directory with a lot of pictures, and you first want to view them all with the image viewer feh and use feh's filelist feature to pick some files for later processing. The processing is to be done with ImageMagick, which has a lot of features for filtering and manipulating images. As image manipulation can be CPU-intensive, this is perfectly suited to ppss.sh.
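If mmv is not at hand, the same rename-and-move step can be done with a plain shell loop. This is only a sketch of an alternative; move_downsampled is a name made up for this example, and the paths in the usage comment are placeholders:

```shell
#!/usr/bin/env bash
# move_downsampled SRC DEST (hypothetical helper): move every
# SRC/*.downmp3 file to DEST, replacing the extension with .mp3.
move_downsampled() {
    local src="$1" dest="$2" f base
    for f in "$src"/*.downmp3; do
        [ -e "$f" ] || continue        # the glob matched nothing
        base=${f##*/}                  # strip the directory part
        mv -- "$f" "$dest/${base%.downmp3}.mp3"
    done
}

# Usage (placeholder paths):
#   move_downsampled /path/to/320kBit/mp3_files /new/path
```

The original .mp3 files in the source directory are left untouched, because the glob only matches the .downmp3 extension.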
A filelist of feh looks like
/path/to/images/1.png
/path/to/images/3.png
/path/to/images/17.png
...
/path/to/images/13815.png
It is produced with the -f filelist command-line parameter. You can delete an image from the filelist by pressing the <DELETE> key while in feh. Unfortunately, absolute paths cannot be used with ppss.sh, but feh itself can be used to convert its filelist to one suitable for ppss.sh:
feh *.png -f feh_list -L %n > ppss_list
or awk:
awk -F/ '{ print $NF }' feh_list > ppss_list
Say you want to do some really time-consuming filtering on all these selected files. ImageMagick cannot read feh's filelists; if you want to process a whole directory, you can use the for loop method mentioned above. feh can also build recursive filelists, which cannot be processed by a simple for loop with ImageMagick. But back to the example: now that you have a filelist, you can do the nifty filtering like
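Because the directory part has been stripped, the list is only useful together with the directory the images live in. Before starting a long run, it can be worth checking that every entry really names a file there. A minimal sketch, with missing_items being a name invented for this example:

```shell
#!/usr/bin/env bash
# missing_items LIST DIR (hypothetical helper): print every line of
# LIST that does not name a regular file inside DIR.
# Returns 0 if nothing is missing, 1 otherwise.
missing_items() {
    local list="$1" dir="$2" item status=0
    while IFS= read -r item; do
        [ -n "$item" ] || continue     # skip empty lines
        if [ ! -f "$dir/$item" ]; then
            printf '%s\n' "$item"
            status=1
        fi
    done < "$list"
    return "$status"
}

# Usage (placeholder path):
#   missing_items ppss_list /path/to/images && echo "list is complete"
```

A recursive filelist would fail this check, since stripping the directory part loses the information about subdirectories.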
ppss.sh -f ppss_list -d /path/to/images -c 'mogrify -solarize 50 "$ITEM"'
(I only do a solarize, but you get the idea).
On the screen you see ppss working:
Feb 03 15:06:06: =========================================================
Feb 03 15:06:06: |P|P|S|S|
Feb 03 15:06:06: Distributed Parallel Processing Shell Script version 2.50
Feb 03 15:06:06: =========================================================
Feb 03 15:06:06: Hostname: xxxxxxxx
Feb 03 15:06:06: ---------------------------------------------------------
Feb 03 15:06:06: CPU: Intel(R) Core(TM)2 Duo CPU T5470 @ 1.60GHz
Feb 03 15:06:06: Found 2 logic processors.
Feb 03 15:06:06: Starting 2 parallel workers.
Feb 03 15:06:06: ---------------------------------------------------------
Feb 03 15:06:17: Currently 100 percent complete. Processed 65 of 65 items.
Feb 03 15:06:17: 1 job is remaining.
Feb 03 15:06:17: Finished. Consult ./ppss/job_log for job output.