BZip2TF

Compress a file using the bzip2 algorithm.

Task package: org.at4j
Java package: org.schmant.task.at4j.bzip2
Category: I/O tasks
Since: 0.8
EntityFS-aware? Yes*
Implements: ActionTaskFactory, GeneratorTaskFactory, ProcessTaskFactory
Produces: The interpreted target property. The exact type depends on the type used for the target.
See also: GZipTF, LzmaTF

Description:

Compress a file using the bzip2 algorithm.

The target property is optional if the source file is of a type that implements Named. If the target is not set, the compressed data will be written to a file with the same name as the source file plus the extension .bz2.
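The default naming rule above can be sketched in plain Java. This is only an illustration of the rule as described; the class and method names are hypothetical and are not part of Schmant or At4J.

```java
public class DefaultTargetName {
    // Hypothetical helper, not a Schmant API: builds the default target
    // name by appending ".bz2" to the name of the source file.
    public static String defaultTargetName(String sourceName) {
        return sourceName + ".bz2";
    }

    public static void main(String[] args) {
        System.out.println(defaultTargetName("f"));           // prints f.bz2
        System.out.println(defaultTargetName("archive.tar")); // prints archive.tar.bz2
    }
}
```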

Note: If another task creates the file to compress, it may be easier to use a BZip2NewWritableFileProxy directly with that task instead of using a separate bzip2 task for the compression.

Note 2: There is no task for decompressing bzip2 compressed files. Use a BZip2ReadableFile or a BZip2ReadableFileProxy instead.

Required properties

Properties

blockSize

The block size used for the bzip2 compression, in hundreds of kilobytes. A higher block size gives more efficient compression, but a higher memory usage for compression and decompression.

Setter method:
setBlockSize(int size)
parameters:
size – The block size, in hundreds of kilobytes. This can be a value between 1 and 9, inclusive.
Default value:
9 (a block size of 900 kB).
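As a rough illustration of the blockSize unit, the sketch below validates a value and converts it to bytes. The class and method are hypothetical and are not part of At4J. It assumes that one unit is 100,000 bytes, as in the bzip2 format; the text above only says "hundreds of kilobytes".

```java
public class BlockSize {
    // Hypothetical helper, not an At4J API.
    // Assumption: one block size unit is 100,000 bytes.
    public static int blockSizeInBytes(int blockSize) {
        if (blockSize < 1 || blockSize > 9) {
            throw new IllegalArgumentException(
                "Block size must be between 1 and 9, inclusive: " + blockSize);
        }
        return blockSize * 100_000;
    }

    public static void main(String[] args) {
        System.out.println(blockSizeInBytes(1)); // prints 100000
        System.out.println(blockSizeInBytes(9)); // prints 900000
    }
}
```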
compressionLevel

The compression level. Setting this property will modify the blockSize property. CompressionLevel.BEST gives a block size of 900 kbytes and CompressionLevel.FASTEST gives a block size of 100 kbytes.

Setter method:
setCompressionLevel(CompressionLevel level)
parameters:
level – The compression level
deleteSourceFile

If this property is set to true, the source file will be deleted after compressing it. This requires that the source file is a Java File or an EntityFS EFile. If it is not, this task does not delete the file and logs a warning.

Setter method:
setDeleteSourceFile(boolean b)
parameters:
b – Should the source file be deleted?
executorService

Several bzip2 compressors (tasks and writable files) can share a BZip2EncoderExecutorService object to share a set of compressor threads.

A BZip2EncoderExecutorService is created by calling the BZip2OutputStream.createExecutorService(int) method.

After using it, the executor service has to be shut down by calling its shutdown() method. If the executor service is used together with tasks that are scheduled in a TaskExecutor, the task executor has to be shut down before the executor service to ensure that the scheduled bzip2 tasks have been run.

If this property is set, the numberOfEncoderThreads property is ignored.

Setter method:
setExecutorService(BZip2EncoderExecutorService es)
parameters:
es – The executor service.
See also:
numberOfEncoderThreads
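The shutdown order described for the executor service can be illustrated with plain JDK thread pools. This is an analogy only: the two pools below are stand-ins for a TaskExecutor and a BZip2EncoderExecutorService, and none of the names in the sketch are Schmant or At4J APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ShutdownOrder {
    public static List<String> run() {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        // Stand-in for the shared encoder executor service.
        ExecutorService encoders = Executors.newFixedThreadPool(2);
        try {
            // Stand-in for the task executor that runs the bzip2 tasks.
            ExecutorService tasks = Executors.newFixedThreadPool(2);
            try {
                for (String name : new String[] {"f1", "f2"}) {
                    tasks.submit(() -> {
                        // Each task hands its work to the shared pool and waits for it.
                        encoders.submit(() -> { events.add("encoded " + name); }).get();
                        return null;
                    });
                }
            } finally {
                // Shut down the task pool first and wait for its tasks to finish.
                tasks.shutdown();
                awaitQuietly(tasks);
                events.add("task executor shut down");
            }
        } finally {
            // Only now is it safe to shut down the shared encoder pool.
            encoders.shutdown();
            events.add("encoder service shut down");
        }
        return events;
    }

    private static void awaitQuietly(ExecutorService pool) {
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Shutting down the pools in the opposite order could leave a task trying to submit work to a pool that no longer accepts it, which is the situation the ordering rule is meant to avoid.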
logFooter

The message that is logged to info level after the task has been successfully run.

Setter method:
setLogFooter(String s)
parameters:
s – The footer message.
Default value:
Empty (no footer message is logged).
See also:
logHeader
logHeader

The message that is logged to info level before the task is run.

Setter method:
setLogHeader(String s)
parameters:
s – The header message.
Default value:
A task class specific message.
See also:
logFooter
numberOfEncoderThreads

Set the number of block encoder threads to use for the bzip2 compression. Since bzip2 compression is CPU-intensive, setting this property to the number of available CPUs will increase the compression speed significantly. The drawback is higher memory consumption.

Setter method:
setNumberOfEncoderThreads(int no)
parameters:
no – The number of encoder threads. If this is set to 0, the compression is done in the thread that runs the bzip2 task.
Default value:
0 (single-threaded encoding).
See also:
executorService
numberOfHuffmanTreeRefinementIterations

The number of Huffman tree refinement iterations that are run when bzip2 compressing data. A higher number here gives a better compression, but also a longer compression time.

Setter method:
setNumberOfHuffmanTreeRefinementIterations(int no)
parameters:
no – The number of Huffman tree refinement iterations.
Default value:
5
overwriteStrategy

The overwrite strategy decides how the task will react if there already is an entity (file or directory) in a location where it wants to create a new entity.

If the strategy is to not overwrite existing entities, the task will fail when it cannot create the entities that it wants to create.

Non-empty directories are never overwritten, regardless of the chosen strategy.

Setter method:
setOverwrite(boolean b)
Setting this to a value of true means that the DoOverwriteAndLogWarning strategy is used. A value of false gives the DontOverwriteAndThrowException strategy.
parameters:
b – Should an existing entity be overwritten?
Setter method:
setOverwriteStrategy(OverwriteStrategy strat)
Set the overwrite strategy.
parameters:
strat – The overwrite strategy.
Default value:
DontOverwriteAndThrowException
See also:
target
reportLevel

This property is used to change the report level for all tasks created by this task factory. The report level is changed for the thread running the task when the task is run, and is restored to its previous level when the task is done.

Setter method:
setReportLevel(Level l)
Set the report level
parameters:
l – The new report level.
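The change-and-restore behavior of the report level can be sketched as a try/finally around the task body. The sketch below is an assumption-laden illustration, not Schmant's implementation: the class, the thread-local field and the use of java.util.logging.Level with a default of INFO are all hypothetical choices made for the example.

```java
import java.util.logging.Level;

public class ReportLevelScope {
    // Hypothetical per-thread report level; Schmant keeps its own state.
    private static final ThreadLocal<Level> CURRENT =
        ThreadLocal.withInitial(() -> Level.INFO);

    public static Level current() {
        return CURRENT.get();
    }

    // Runs the task with the given level and restores the previous level
    // afterwards, even if the task throws.
    public static void runWithLevel(Level level, Runnable task) {
        Level previous = CURRENT.get();
        CURRENT.set(level);
        try {
            task.run();
        } finally {
            CURRENT.set(previous);
        }
    }

    public static void main(String[] args) {
        runWithLevel(Level.FINE, () -> System.out.println("during: " + current()));
        System.out.println("after: " + current());
    }
}
```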
source (required)

The file to compress.

Setter method:
setSource(Object o)
parameters:
o –  The source file.
Interpreted by InterpretAsReadableFileStrategy.
target (required)

The file that the compressed data should be put in.

Setter method:
setTarget(Object o)
Set the target.
parameters:
o – A target that can be interpreted as an existing or non-existing file. If the target is an existing file or directory, the overwriteStrategy property determines what will happen.
Interpreted by InterpretAsNewWritableFileStrategy.
See also:
overwriteStrategy
traceLogging

If trace logging is enabled for a task, it reports its configuration before it is run.

Trace logging may also be enabled globally for all tasks by calling TraceMode.setTraceMode(boolean).

Setter method:
setTraceLogging(boolean b)
Enable or disable trace logging.
parameters:
b – Enable trace logging?
useCommonsCompressImplementation

Use the Apache Commons Compress bzip2 implementation instead of At4J's.

If this property is set, many of the other properties of this task are ignored.

Setter method:
setUseCommonsCompressImplementation(boolean b)
parameters:
b – Should the Apache Commons Compress bzip2 implementation be used?
Default value:
false (At4J's implementation is used).

Examples

This TarTF example shows how the BZip2NewWritableFileProxy is used to compress the Tar archive while it is being created.

Example 1

Compress the file f in the directory dir to f.bz2 in the same directory.

// Enable the task package in the script header
// enableTaskPackage org.at4j

import org.at4j.comp.CompressionLevel
import org.entityfs.util.Directories
import org.schmant.support.FutureFile
import org.schmant.task.at4j.bzip2.BZip2TF

new BZip2TF().
  // Use maximum compression.
  // (...which happens to be the default, so we would not have to set this)
  setCompressionLevel(CompressionLevel.BEST).
  setSource(Directories.getFile(dir, "f")).
  setTarget(new FutureFile(dir, "f.bz2")).run()

Example 2

Compress the files f1 and f2 in the directory d to f1.bz2 and f2.bz2 in the same directory. Use a bzip2 encoder executor service to spread the encoding over as many threads as there are available CPUs.

// Enable the task package in the script header
// enableTaskPackage org.at4j

import org.at4j.comp.bzip2.BZip2OutputStream
import org.entityfs.util.Directories
import org.schmant.run.TaskExecutor
import org.schmant.support.FutureFile
import org.schmant.task.at4j.bzip2.BZip2TF

// Create the executor service. There are two static methods on
// BZip2OutputStream for this. The one without arguments creates an executor
// service that will use one thread for each available CPU.
def executorService = BZip2OutputStream.createExecutorService()
try {
  // Use a task executor to run the two bzip2 tasks in parallel. Otherwise they
  // would not be able to share the threads of the executor service. Using an
  // executor service would still give a performance improvement for each task,
  // though, even if they were not run in parallel.
  def te = new TaskExecutor().
    setNumberOfThreads(2).
    start()
  try {
    // Don't set a target file. The default is to create a target file with the
    // source file name and the extension .bz2
    te.add(new BZip2TF().
      setSource(Directories.getFile(d, "f1")).
      // Delete the source file
      setDeleteSourceFile(true).
      setExecutorService(executorService))
    te.add(new BZip2TF().
      setSource(Directories.getFile(d, "f2")).
      // Delete the source file
      setDeleteSourceFile(true).
      setExecutorService(executorService))
    te.waitFor()
  } finally {
    // Shut down the task executor.
    // This must be shut down BEFORE the executor service to ensure that the
    // bzip2 tasks have actually had a chance to encode their data before the
    // executor service is shut down.
    te.shutdown()
  }
} finally {
  // Shut down the executor service to release its resources.
  executorService.shutdown()
}


* An EntityFS-aware task is implemented using EntityFS. This means that it uses the filter settings of DirectoryView objects, and also that it can often work with file system implementations other than the File-based one, such as the RAM file system.