at.knowcenter.wag.egov.egiz.pdf
Class Placeholder

java.lang.Object
  extended by at.knowcenter.wag.egov.egiz.pdf.Placeholder

public abstract class Placeholder
extends Object

Helper class that provides functionality for dealing with placeholders and replacements in PDF documents.

Author:
wprinz

Constructor Summary
Placeholder()
           
 
Method Summary
static byte[] applyURLEncoding(String text)
          Applies the URL encoding to the text.
static byte[] applyWinAnsiEncoding(String text)
          Converts the given String into a byte array that can be substituted into the placeholder.
protected static boolean canBreakAfter(byte character)
          Tells whether a line break can occur after the given character.
static byte[] escapeByte(byte data)
          Escapes the data byte if necessary.
static byte[] escapePDFString(byte[] data)
          Escapes the String so that it is a suitable PDF literal string.
protected static byte[] escapeToken(byte[] token)
           
protected static boolean isURLEncoded(String text)
          Checks the presence of typical URL encoded characters to tell if the string is URL encoded.
static List parseStrings(byte[] pdf, int stream_start, int stream_next)
          Scans the given PDF content stream for literal PDF strings.
protected static byte[] readToken(byte[] bytes, int index)
           
static String reconstructStringFromPartition(byte[] pdf, List sis, byte[] enc)
          Reconstructs the string from a partition of placeholders.
static void replacePlaceholderWithTolerance(byte[] pdf, List sis, byte[] replace_bytes, int tolerance)
          Replaces the placeholder with the given String, breaking lines within the given tolerance.
static String unapplyURLEncoding(String winansi_str)
          Unapplies the WinAnsi and URL encoding.
static String unapplyWinAnsiEncoding(byte[] replace_bytes)
          Unapplies the WinAnsi encoding.
static byte[] unescapePDFString(byte[] data)
          Unescapes the PDF String.
static String unprepareAndUnescapeString(byte[] pdf_string)
          Restores the String from a previously prepared byte array.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Placeholder

public Placeholder()
Method Detail

escapePDFString

public static byte[] escapePDFString(byte[] data)
Escapes the String so that it is a suitable PDF literal string.

Parameters:
data - The String to be escaped.
Returns:
Returns the escaped PDF String.

unescapePDFString

public static byte[] unescapePDFString(byte[] data)
Unescapes the PDF String.

Parameters:
data - The escaped String.
Returns:
Returns the unescaped String.
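
To illustrate what escaping and unescaping a PDF literal string usually involves, here is a minimal sketch. It is not the implementation of this class. It assumes that only the backslash and the two parentheses receive a backslash prefix, and its unescape step only reverses that; other escape forms such as octal codes are not handled. The class name LiteralStringSketch and its method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical stand-in for illustration; not the real Placeholder class.
final class LiteralStringSketch {

    // Prefixes '(', ')' and '\' with a backslash; all other bytes pass through.
    static byte[] escape(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(data.length + 8);
        for (byte b : data) {
            if (b == '(' || b == ')' || b == '\\') {
                out.write('\\');
            }
            out.write(b);
        }
        return out.toByteArray();
    }

    // Reverses escape(): drops each backslash and keeps the byte that follows it.
    static byte[] unescape(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(data.length);
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\\' && i + 1 < data.length) {
                i++; // skip the backslash itself
            }
            out.write(data[i]);
        }
        return out.toByteArray();
    }
}
```

Under these assumptions, unescape(escape(x)) returns the original bytes for any input x.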

reconstructStringFromPartition

public static String reconstructStringFromPartition(byte[] pdf,
                                                    List sis,
                                                    byte[] enc)
                                             throws IOException
Reconstructs the string from a partition of placeholders.

Parameters:
pdf - The PDF to read the string from.
sis - The list of StringInfo objects that specify the bytes of the string in the pdf.
Returns:
Returns the extracted and reconverted string.
Throws:
IOException - Forwarded exception.

applyWinAnsiEncoding

public static byte[] applyWinAnsiEncoding(String text)
Converts the given String into a byte array that can be substituted into the placeholder.

Parameters:
text - The text to be prepared for substitution.
Returns:
Returns the prepared byte array.

unapplyWinAnsiEncoding

public static String unapplyWinAnsiEncoding(byte[] replace_bytes)
Unapplies the WinAnsi encoding.

Parameters:
replace_bytes - The WinAnsi encoded bytes.
Returns:
Returns the decoded String.
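
The following sketch shows one way a String can be turned into single-byte WinAnsi-style data and back. It assumes that the JDK charset windows-1252 is an acceptable approximation of the PDF WinAnsiEncoding; whether this class uses that charset is not stated here. The class name WinAnsiSketch is hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical stand-in for illustration; not the real Placeholder class.
final class WinAnsiSketch {

    // Assumption: windows-1252 approximates the PDF WinAnsiEncoding.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    // One byte per character; characters outside the charset become '?'.
    static byte[] encode(String text) {
        return text.getBytes(CP1252);
    }

    // Maps each byte back to its character.
    static String decode(byte[] bytes) {
        return new String(bytes, CP1252);
    }
}
```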

applyURLEncoding

public static byte[] applyURLEncoding(String text)
Applies the URL encoding to the text.

Parameters:
text - The text to be encoded.
Returns:
Returns the URL and WinAnsi encoded text.

unapplyURLEncoding

public static String unapplyURLEncoding(String winansi_str)
Unapplies the WinAnsi and URL encoding.

Parameters:
winansi_str - The WinAnsi and URL encoded text.
Returns:
Returns the decoded text.
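
As a rough illustration of the URL encoding round trip, the sketch below uses java.net.URLEncoder and java.net.URLDecoder. It assumes UTF-8 as the character encoding for the percent escapes, which may differ from what this class uses. Because URL encoded text is plain ASCII, the byte conversion is done with US-ASCII here. The class name UrlEncodingSketch is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration; not the real Placeholder class.
final class UrlEncodingSketch {

    // URL-encodes the text (assumed UTF-8) and returns the ASCII bytes.
    static byte[] encode(String text) {
        try {
            String urlEncoded = URLEncoder.encode(text, "UTF-8");
            return urlEncoded.getBytes(StandardCharsets.US_ASCII);
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a mandatory charset, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Decodes URL encoded text back to the original String.
    static String decode(String urlEncoded) {
        try {
            return URLDecoder.decode(urlEncoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```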

unprepareAndUnescapeString

public static String unprepareAndUnescapeString(byte[] pdf_string)
Restores the String from a previously prepared byte array.

Parameters:
pdf_string - The byte array.
Returns:
Returns the unprepared String.

isURLEncoded

protected static boolean isURLEncoded(String text)
Checks the presence of typical URL encoded characters to tell if the string is URL encoded.

This heuristic checks whether the String contains any characters that cannot occur in URL encoded text, such as ASCII control characters, which are not part of the URL encoding character set.

Parameters:
text - The text under suspicion.
Returns:
Returns true if the String is URL encoded, false otherwise.
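
A heuristic of this kind could look like the sketch below. The exact character set this class accepts is not documented here; the sketch assumes the characters that java.net.URLEncoder can emit (letters, digits, '.', '-', '*', '_', '+' and '%'). The class and method names are hypothetical.

```java
// Hypothetical stand-in for illustration; not the real Placeholder class.
final class UrlHeuristicSketch {

    // Returns false as soon as a character is found that URL encoded
    // text cannot contain, for example a space or a control character.
    static boolean looksUrlEncoded(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean alphanumeric = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            boolean allowedMark = ".-*_+%".indexOf(c) >= 0;
            if (!alphanumeric && !allowedMark) {
                return false;
            }
        }
        return true;
    }
}
```

Such a check can only rule text out. A plain word that happens to consist of allowed characters also passes it.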

canBreakAfter

protected static boolean canBreakAfter(byte character)
Tells whether a line break can occur after the given character.

Parameters:
character - The character.
Returns:
Returns true if a break may occur after the character, false otherwise.

parseStrings

public static List parseStrings(byte[] pdf,
                                int stream_start,
                                int stream_next)
Scans the given PDF content stream for literal PDF strings.

Parameters:
pdf - The PDF.
stream_start - The start of the content stream to be scanned.
stream_next - The end of the content stream.
Returns:
Returns a list of StringInfo objects specifying the strings that could be found.
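
The scan can be pictured with the following sketch, which looks for balanced parentheses and honours backslash escapes inside a string. It is a simplification under stated assumptions: hexadecimal strings, comments and inline image data are ignored, and since the StringInfo class is not described here, the result is returned as hypothetical int pairs of start offset (inclusive) and end offset (exclusive) of the string content. The class name LiteralScanSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; not the real Placeholder class.
final class LiteralScanSketch {

    // Returns {start, end} offsets of each top-level literal string's
    // content between the outer parentheses, within pdf[start..end).
    static List<int[]> scan(byte[] pdf, int start, int end) {
        List<int[]> found = new ArrayList<>();
        int depth = 0;   // nesting depth of unescaped parentheses
        int open = -1;   // offset of the first content byte
        for (int i = start; i < end; i++) {
            byte b = pdf[i];
            if (b == '\\' && depth > 0) {
                i++; // skip the escaped byte inside a string
            } else if (b == '(') {
                if (depth == 0) {
                    open = i + 1;
                }
                depth++;
            } else if (b == ')' && depth > 0) {
                depth--;
                if (depth == 0) {
                    found.add(new int[] { open, i });
                }
            }
        }
        return found;
    }
}
```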

escapeByte

public static byte[] escapeByte(byte data)
Escapes the data byte if necessary.

Before bytes can be written into PDF strings, they have to be escaped. Special care has to be taken that escape sequences are not split by line breaks. A split escape sequence can have fatal consequences and usually renders the whole document invalid.

Parameters:
data - The data byte to be escaped.
Returns:
Returns a new byte array containing the escaped data byte. If the byte does not need to be escaped, this new array contains only the original data byte.
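
The point about not splitting escape sequences can be sketched as follows. Each raw byte is escaped into a unit of one or two bytes, and a line only accepts whole units. The sketch assumes that only the backslash and the parentheses are escaped; the class and method names are hypothetical.

```java
// Hypothetical stand-in for illustration; not the real Placeholder class.
final class EscapeUnitSketch {

    // Returns a two-byte escape pair for '(', ')' and '\',
    // otherwise a one-byte array holding the byte itself.
    static byte[] escapeOne(byte data) {
        if (data == '(' || data == ')' || data == '\\') {
            return new byte[] { '\\', data };
        }
        return new byte[] { data };
    }

    // Counts how many leading raw bytes fit into a line of the given
    // capacity once escaped, never splitting an escape pair.
    static int rawBytesThatFit(byte[] raw, int capacity) {
        int used = 0;
        int count = 0;
        for (byte b : raw) {
            int unit = escapeOne(b).length;
            if (used + unit > capacity) {
                break; // the whole unit moves to the next line
            }
            used += unit;
            count++;
        }
        return count;
    }
}
```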

replacePlaceholderWithTolerance

public static void replacePlaceholderWithTolerance(byte[] pdf,
                                                   List sis,
                                                   byte[] replace_bytes,
                                                   int tolerance)
                                            throws PDFDocumentException
Replaces the placeholder with the given String, breaking lines within the given tolerance.

Parameters:
pdf - The PDF.
sis - The list of StringInfo objects describing the positions where the String should be filled in.
replace_bytes - The unescaped bytes to be filled in. Escaping is performed by this method.
tolerance - The tolerance for line wrapping. The tolerance counts from the end of a StringInfo backwards to its start. If a word that starts within the tolerance doesn't fit, it is wrapped into the next line.
Throws:
PDFDocumentException - Forwarded exception.
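
One possible reading of the tolerance rule is sketched below. It is an interpretation, not the algorithm of this class. A line has a fixed capacity, and when the remaining text does not fit, the sketch searches backwards from the end of the line, at most tolerance bytes, for a position after which a break is allowed. If none is found, the word is split at the capacity. Escaping is left out, and the break characters (space and hyphen) are an assumption. All names are hypothetical.

```java
// Hypothetical stand-in for illustration; not the real Placeholder class.
final class ToleranceWrapSketch {

    // Assumption: a break is allowed after a space or a hyphen.
    static boolean breakAllowedAfter(byte c) {
        return c == ' ' || c == '-';
    }

    // Returns how many bytes of text, starting at offset, are placed
    // into a line that can hold capacity bytes (capacity > 0).
    static int lineLength(byte[] text, int offset, int capacity, int tolerance) {
        int remaining = text.length - offset;
        if (remaining <= capacity) {
            return remaining; // the rest fits completely
        }
        // Search backwards from the line end, at most tolerance bytes.
        int lowest = Math.max(1, capacity - tolerance);
        for (int len = capacity; len >= lowest; len--) {
            if (breakAllowedAfter(text[offset + len - 1])) {
                return len; // the next word wraps into the next line
            }
        }
        return capacity; // no break point in range: split the word
    }
}
```

With the text "hello world foo", a capacity of 8 and a tolerance of 4, the first line takes the six bytes "hello " and "world" wraps to the next line. With a tolerance of 1 no break point is in range, so the first line is filled with all 8 bytes.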

readToken

protected static byte[] readToken(byte[] bytes,
                                  int index)

escapeToken

protected static byte[] escapeToken(byte[] token)
                             throws IOException
Throws:
IOException


Copyright © 2006-2007 EGIZ - E-Government Innovationszentrum. All Rights Reserved.