summaryrefslogtreecommitdiff
path: root/xmerge/java/org/openoffice/xmerge/converter/xml/sxw/aportisdoc/package.html
diff options
context:
space:
mode:
Diffstat (limited to 'xmerge/java/org/openoffice/xmerge/converter/xml/sxw/aportisdoc/package.html')
-rw-r--r--xmerge/java/org/openoffice/xmerge/converter/xml/sxw/aportisdoc/package.html237
1 files changed, 0 insertions, 237 deletions
diff --git a/xmerge/java/org/openoffice/xmerge/converter/xml/sxw/aportisdoc/package.html b/xmerge/java/org/openoffice/xmerge/converter/xml/sxw/aportisdoc/package.html
deleted file mode 100644
index 78cfe79bfbbf..000000000000
--- a/xmerge/java/org/openoffice/xmerge/converter/xml/sxw/aportisdoc/package.html
+++ /dev/null
@@ -1,237 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
-<!--
-
- DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
-
- Copyright 2000, 2010 Oracle and/or its affiliates.
-
- OpenOffice.org - a multi-platform office productivity suite
-
- This file is part of OpenOffice.org.
-
- OpenOffice.org is free software: you can redistribute it and/or modify
- it under the terms of the GNU Lesser General Public License version 3
- only, as published by the Free Software Foundation.
-
- OpenOffice.org is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU Lesser General Public License version 3 for more details
- (a copy is included in the LICENSE file that accompanied this code).
-
- You should have received a copy of the GNU Lesser General Public License
- version 3 along with OpenOffice.org. If not, see
- <http://www.openoffice.org/license.html>
- for a copy of the LGPLv3 License.
-
--->
-<html>
-<head>
-<title>org.openoffice.xmerge.converter.xml.sxw.aportisdoc package</title>
-</head>
-
-<body bgcolor="white">
-
-<p>Provides the tools for doing the conversion of StarWriter XML to
-and from AportisDoc format.</p>
-
-<p>It follows the {@link org.openoffice.xmerge} framework for the conversion process.</p>
-
-<p>Since it converts to/from a Palm application format, these converters
-follow the <a href=../../../../converter/palm/package-summary.html#streamformat>
-<code>PalmDB</code> stream format</a> for writing out to the Palm sync client or
-reading in from the Palm sync client.</p>
-
-<p>Note that <code>PluginFactoryImpl</code> also provides a
-<code>DocumentMerger</code> object, i.e. {@link org.openoffice.xmerge.converter.xml.sxw.aportisdoc.DocumentMergerImpl DocumentMergerImpl}.
-This functionality was derived from its superclass
-{@link org.openoffice.xmerge.converter.xml.sxw.SxwPluginFactory
-SxwPluginFactory}.</p>
-
-<h2>AportisDoc pdb format - Doc</h2>
-
-<p>The AportisDoc pdb format is widely used by different Palm applications,
-e.g. QuickWord, AportisDoc Reader, MiniWrite, etc. Note that some
-of these applications put tweaks into the format. The converters will only
-support the default AportisDoc format, plus some very minor tweaks to accommodate
-other applications.</p>
-
-<p>The text content of the format is plain text, i.e. there are no styles
-or structures. There is no notion of lists, list items, paragraphs,
-headings, etc. The format does have support for bookmarks.</p>
-
-<p>For most Doc applications, the default character encoding supported is
-the extended ASCII character set, i.e. ISO-8859-1. StarWriter XML is in
-UTF-8 encoding scheme. Since UTF-8 encoding scheme covers more characters,
-converting UTF-8 strings into extended ASCII would mean that there can be
-possible loss of character mappings.</p>
-
-<p>Using JAXP, XML files can be parsed and read in as Java <code>String</code>s
-which is in Unicode format, there is no loss of character mapping from UTF-8
-to Java Strings. There is possible loss of character mapping in
-converting Java <code>String</code>s to ASCII bytes. Java characters that
-cannot be represented in extended ASCII are converted into the ASCII
-character '?' or x3F in hex digit via the <code>String.getBytes(encoding)</code>
-API.</p>
-
-<h2>SXW to DOC Conversion</h2>
-
-<p>The <code>DocumentSerializerImpl</code> class implements the
-<code>org.openoffice.xmerge.DocumentSerializer</code>.
-This class specifically provides the conversion process from a given
-<code>SxwDocument</code> object to DOC formatted records, which are
-then passed back to the client via the <code>ConvertData</code> object.</p>
-
-<p>The following XML tags are handled. [Note that some may not be implemented yet.]</p>
-<ul>
-<li>
- <p>Paragraphs <tt>&lt;text:p&gt;</tt> and Headings <tt>&lt;text:h&gt;</tt></p>
-
- <p>Heading elements are classified the same as paragraph
- elements since both have the same possible elements inside.
- Their main difference is that they refer to different types
- of style information, which is outside of their element tags.
- Since there are no styles on the DOC format, headings should
- be treated the same way a paragraph is converted.</p>
-
- <p>For paragraph elements, convert and transfer text nodes
- that are essential. Text nodes directly contained within paragraph
- nodes are such. There are also a number of elements that
- a paragraph element may contain. These are explained in their
- own context.</p>
-
- <p>At the end of the paragraph, an EOL character is added by
- the converter to provide a separation for each paragraph,
- since the Doc format does not have a notion of a paragraph.</p>
-</li>
-<li>
- <p>White spaces <tt>&lt;text:s&gt;</tt> and Tabs <tt>&lt;text:tab-stop&gt;</tt></p>
-
- <p>In SXW, normally 2 or more white-space characters are collapsed into
- a single space character. In order to make sure that the document
- content really contains those white-space characters, there are special
- elements assigned to them.</p>
-
- <p>The space element specifies the number of spaces are in it.
- Thus, converting it just means providing the specific number of spaces
- that the element requires.</p>
-
- <p>There is also the tab-stop element. This is a bit tricky. In a
- StarWriter document, tab-stops are specified by a column position.
- A tab is not an exact number of space, but rather a specific column
- positioning. Say, regular tab-stops are set at every 5th column.
- At column 4, if I hit a tab, it goes to column 5. At column 1, hitting
- a tab would put the cursor at column 5 as well. SmartDoc and AporticDoc
- applications goes by columns for the ASCII tab character. The only problem
- is that in StarWriter, one could specify a different tab-stop, but not
- in most of these Doc applications, at least I have not seen one.
- Solution for this is just to go with the converting to the ASCII tab
- character and not do anything for different tab-stop positioning.</p>
-</li>
-<li>
- <p>Line breaks <tt>&lt;text:line-break&gt;</tt></p>
-
- <p>To represent line breaks, it is simpliest to just put an ASCII LF
- character. Note that the side effect of this is that an end of paragraph
- also contains an ASCII LF character. Thus, for the DOC to SXW conversion,
- line breaks are not distinguishable from specifying the end of a
- paragraph.</p>
-</li>
-<li>
- <p>Text spans <tt>&lt;text:span&gt;</tt></p>
-
- <p>Text spans contain text that have different style attributes
- from the paragraphs'. Text spans can be embedded within another
- text span. Since it is purely for style tagging, we only needed
- to convert and transfer the text elements within these.</p>
-</li>
-<li>
- <p>Hyperlinks <tt>&lt;text:a&gt;</tt>
-
- <p>Convert and transfer the text portion.</p>
-</li>
-<li>
- <p>Bookmarks <tt>&lt;text:bookmark&gt;</tt> <tt>&lt;text:bookmark-start&gt;</tt>
- <tt>&lt;text:bookmark-end&gt;</tt> [Not implemented yet]</p>
-
- <p>In SXW, bookmark elements are embedded inside paragraph elements.
- Bookmarks can either mark a text position or a text range. <tt>&lt;text:bookmark&gt;</tt>
- marks a position while the pair <tt>&lt;text:bookmark-start&gt;</tt> and
- <tt>&lt;text:bookmark-end&gt;</tt></p> marks a text range. The DOC format only
- supports bookmarking a text position. Thus, for the conversion,
- <tt>&lt;text:bookmark&gt;</tt> and <tt>&lt;text:bookmark-start&gt;</tt> will both mark
- a text position.</p>
-</li>
-<li>
- <p>Change Tracking <tt>&lt;text:tracked-changes&gt;</tt>
- <tt>&lt;text:change*&gt;</tt> [Not implemented yet]</p>
-
- <p>Change tracking elements are not supported yet on the current
- OpenOffice XML filters, will have to watch out on this. The text
- within these elements have to be interpreted properly during the
- conversion process.</p>
-</li>
-<li>
- <p>Lists <tt>&lt;text:unordered-list&gt;</tt> and
- <tt>&lt;text:ordered-lists&gt;</tt></p>
-
- <p>A list can only contain one optional <tt>&lt;text:list-header&gt;</tt>
- and one or more <tt>&lt;text:list-item&gt;</tt> elements.</p>
-
- <p>A <tt>&lt;text:list-header&gt;</tt> contains one or more paragraph
- elements. Since there are no styles, the conversion process does not
- do anything special for list headers, conversion for the paragraphs
- within list headers are the same as explained above.</p>
-
- <p>A <tt>&lt;text:list-item&gt;</tt> may contain one or more of paragraphs,
- headings, list, etc. Since the Doc format does not support any list
- structure, there will not be any special handling for this element.
- Conversion for elements within it shall be applied according to the
- element type. Thus, lists with paragraphs within it will result in just
- plain paragraphs. Sublists will not be identifiable. Paragraphs in
- sublists will still appear.</p>
-</li>
-<li>
- <p><tt>&lt;text:section&gt;</tt></p>
-
- <p>I am not sure what this is yet, will need to investigate more on this.</p>
-</li>
-</ul>
-<p>There may be other tags that will still need to be addressed for this conversion.</p>
-
-<p>Refer to {@link org.openoffice.xmerge.converter.xml.sxw.aportisdoc.DocumentSerializerImpl DocumentSerializerImpl}
-for details of implementation. It uses <code>DocEncoder</code> class to do the encoding
-part.</p>
-
-<h2>DOC to SXW Conversion</h2>
-
-<p>The <code>DocumentDeserializerImpl</code> class implements the
-<code>org.openoffice.xmerge.DocumentDeserializer</code>. It is
-passed the device document in the form of a <code>ConvertData</code> object.
-It will then create a <code>SxwDocument</code> object from the conversion of
-the DOC formatted records.</p>
-
-<p>The text content of the Doc format will be transferred as text. Paragraph
-elements will be formed based on the existence of an ASCII LF character. There
-will be at least one paragraph element.</p>
-
-<p>Bookmarks in the Doc format will be converted to the bookmark element
-<tt>&lt;text:bookmark&gt;</tt> [Not implemented yet].</p>
-
-
-<h2>Merging changes</h2>
-
-<p>As mentioned above, the <code>DocumentMerger</code> object produced by
-<code>PluginFactoryImpl</code> is <code>DocumentMergerImpl</code>.
-Refer to the javadocs for that package/class on its merging specifications.
-</p>
-
-<h2>TODO list</h2>
-
-<p><ol>
-<li>Investigate Palm's with different character encodings.</li>
-<li>Investigate other StarWriter XML tags</li>
-</ol></p>
-
-</body>
-</html>