Re: ArrayIndexOutOfBoundsException: -1 stack periodically occurs

From: "phillip.s.powell@gmail.com" <phillip.s.powell@gmail.com>
Newsgroups: comp.lang.java.help
Date: 16 Mar 2007 12:24:23 -0700
Message-ID: <1174073062.980633.294770@e1g2000hsg.googlegroups.com>

On Mar 16, 12:23 pm, "phillip.s.pow...@gmail.com"
<phillip.s.pow...@gmail.com> wrote:
> On Mar 16, 12:15 pm, Tom Hawtin <use...@tackline.plus.com> wrote:
> > phillip.s.pow...@gmail.com wrote:
> > > I read throughout Sun's sites, particularly the bugs db, that there
> > > are a number of issues within JEditorPane itself inasmuch as how it
> > > handles HTML. Unfortunately, Java seems to provide no way of cleaning
> > > up the HTML once set using setPage() (you would think you can
> >
> > setPage loads the page in the background. Practically everything to do
> > with Swing and threading is utterly broken.
> >
> > What I suggest is loading the page contents yourself. Insert the data
> > into the editor pane in sections *on the EDT*.
> >
> > Tom Hawtin
>
> Would that be accomplished this way:
>
> SwingUtilities.invokeLater(new Runnable() {
>     public void run() {
>         SimpleBrowser.this.browser.setText(cleanedHTML);
>     }
> });
>
> ??

Sorry, but this is clearly not working, and I wonder if setText() ever
works for JEditorPane.
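
For comparison, here is a minimal sketch of the approach suggested in the quoted reply: fetch the HTML on a background thread, then touch the JEditorPane only on the EDT. EdtHtmlLoader and its Callable<String> fetcher parameter are hypothetical names used for illustration and are not part of the class posted below. The sketch also simplifies the suggestion by installing the whole page in one setText() call rather than in sections.

```java
import java.util.concurrent.Callable;
import javax.swing.JEditorPane;
import javax.swing.SwingUtilities;

/** Hypothetical helper: fetch HTML off the EDT, then install it on the EDT. */
final class EdtHtmlLoader {

    private EdtHtmlLoader() {
    }

    /**
     * Runs the fetcher on a background thread and hands its result to the
     * pane on the event dispatch thread. Returns the worker thread so that
     * callers can join it if they need to.
     */
    static Thread load(final JEditorPane pane, final Callable<String> fetcher) {
        Thread worker = new Thread(new Runnable() {
            public void run() {
                final String html;
                try {
                    html = fetcher.call(); // slow I/O stays off the EDT
                } catch (Exception e) {
                    e.printStackTrace();
                    return;
                }
                SwingUtilities.invokeLater(new Runnable() {
                    public void run() {
                        // Every Swing call happens here, on the EDT
                        pane.setContentType("text/html");
                        pane.setText(html);
                    }
                });
            }
        }, "html-fetcher");
        worker.start();
        return worker;
    }
}
```

In a browser-like class the fetcher would wrap whatever code downloads and cleans the page, so that only the final setText() call runs on the EDT.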

Here is my code:

[code]
/*
* SimpleHTMLRenderableEditorPane.java
*
* Created on March 13, 2007, 3:39 PM
*
* To change this template, choose Tools | Template Manager
* and open the template in the editor.
*/

package com.ppowell.tools.ObjectTools.SwingTools;

import java.io.*;
import java.net.*;
import javax.swing.JEditorPane;
import javax.swing.text.html.HTMLEditorKit;

/**
* A safer version of {@link javax.swing.JEditorPane}
* @author Phil Powell
* @version JDK 1.6.0
*/
public class SimpleHTMLRenderableEditorPane extends JEditorPane {

    //--------------------------- --* CONSTRUCTORS *-- ---------------------------
    // <editor-fold defaultstate="collapsed" desc=" Constructors ">
    /** Creates a new instance of SimpleHTMLRenderableEditorPane */
    public SimpleHTMLRenderableEditorPane() {
        super();
    }

    /**
     * Creates a new instance of SimpleHTMLRenderableEditorPane
     * @param url {@link java.lang.String}
     * @throws java.io.IOException Thrown if an I/O exception occurs
     */
    public SimpleHTMLRenderableEditorPane(String url) throws IOException {
        super(url);
    }

    /**
     * Creates a new instance of SimpleHTMLRenderableEditorPane
     * @param type {@link java.lang.String}
     * @param text {@link java.lang.String}
     */
    public SimpleHTMLRenderableEditorPane(String type, String text) {
        super(type, text);
    }

    /**
     * Creates a new instance of SimpleHTMLRenderableEditorPane
     * @param url {@link java.net.URL}
     * @throws java.io.IOException Thrown if an I/O exception occurs
     */
    public SimpleHTMLRenderableEditorPane(URL url) throws IOException {
        super(url);
    }
    // </editor-fold>
    //----------------------- --* GETTER/SETTER METHODS *-- ----------------------
    // <editor-fold defaultstate="collapsed" desc=" Getter/Setter Methods ">
    /**
     * Retrieve HTML content
     * @return html {@link java.lang.String}
     */
    public String getText() {
        try {
            /*
             * I decided to use java.net.HttpURLConnection to retrieve the
             * HTML code from the remote site instead of using
             * super.getText(), because the HTML it returns is constantly
             * stripped down to primitive HTML template formatting,
             * regardless of the original HTML source code.
             */
            HttpURLConnection conn =
                    (HttpURLConnection) getPage().openConnection();
            conn.setUseCaches(false);
            conn.setDefaultUseCaches(false);
            conn.setDoOutput(false); // READ-ONLY
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(
                    conn.getInputStream()));
            int data;
            StringBuffer sb = new StringBuffer();
            char[] ch = new char[512];
            while ((data = in.read(ch)) != -1) {
                sb.append(ch, 0, data);
            }
            in.close();
            conn.disconnect();
            return sb.toString();
        } catch (IOException e) {
            // Default to super.getText() if there is no I/O connection
            return super.getText();
        }
    }

    /**
     * Overridden to work around an HTML rendering bug (Bug ID: 4695909).
     * @param text {@link java.lang.String}
     */
    public void setText(String text) {
        // Workaround for Bug ID: 4695909 in Java 1.4:
        // JEditorPane does not handle the META tag in the HTML HEAD
        if (isJava14() && "text/html".equalsIgnoreCase(getContentType())) {
            text = stripMetaTag(text);
        }
        super.setText(text);
    }
    // </editor-fold>
    //--------------------------- --* OTHER METHODS *-- --------------------------
    // <editor-fold defaultstate="collapsed" desc=" Methods ">
    /**
     * Clean HTML to remove things like <link>, <script>,
     * <style>, <object>, <embed>, and HTML comments.
     * Based upon <a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4695909">bug report</a>
     */
    public void cleanHTML() {
        try {
            setText(cleanHTML(getText()));
        } catch (Exception e) {} // DO NOTHING
    }

    /**
     * Clean HTML
     * @param html {@link java.lang.String}
     * @return html {@link java.lang.String}
     */
    public String cleanHTML(String html) {
        String[] tagArray = {"<LINK", "<SCRIPT", "<STYLE", "<OBJECT", "<EMBED", "<!--"};
        // Upper-cased copy used only for case-insensitive searching
        String upperHTML = html.toUpperCase();
        for (int i = 0; i < tagArray.length; i++) {
            // Comments end with "-->"; every other tag ends with "</NAME"
            String endTag = "<!--".equals(tagArray[i])
                    ? "-->" : "</" + tagArray[i].substring(1);
            int index = upperHTML.indexOf(tagArray[i]);
            while (index >= 0) {
                int endIndex = upperHTML.indexOf(endTag, index);
                // Cut through the '>' that closes the end tag, or the start
                // tag itself when there is no end tag (e.g. <link>)
                int close = upperHTML.indexOf(">", endIndex >= 0 ? endIndex : index);
                // An unterminated tag swallows the rest of the document
                int cut = (close >= 0) ? close + 1 : upperHTML.length();
                html = html.substring(0, index) + html.substring(cut);
                upperHTML = upperHTML.substring(0, index) + upperHTML.substring(cut);
                index = upperHTML.indexOf(tagArray[i], index);
            }
        }
        // REF: http://forum.java.sun.com/thread.jspa?threadID=213582&messageID=735120
        // Drop anything after </html>, but only when that tag is present
        int htmlEnd = upperHTML.indexOf("</HTML");
        int htmlClose = (htmlEnd >= 0) ? upperHTML.indexOf(">", htmlEnd) : -1;
        if (htmlClose >= 0) {
            html = html.substring(0, htmlClose + 1);
        }
        // REF: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5042872
        return html.trim();
    }

    /**
     * Retrieves the HTML at the given URL and passes it to
     * cleanHTML(String html) for cleaning
     * @param url {@link java.net.URL}
     * @return html {@link java.lang.String}
     */
    public String cleanHTML(URL url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) url.openConnection();
            conn.setUseCaches(false);
            conn.setDefaultUseCaches(false);
            conn.setDoOutput(false); // READ-ONLY
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(
                    conn.getInputStream()));
            int data;
            StringBuffer sb = new StringBuffer();
            char[] ch = new char[512];
            while ((data = in.read(ch)) != -1) {
                sb.append(ch, 0, data);
            }
            in.close();
            conn.disconnect();
            return cleanHTML(sb.toString());
        } catch (IOException e) {
            e.printStackTrace();
            return null;
        }
    }

    /**
     * Determine if java version is 1.4.
     * @return true if java version is 1.4.x....
     */
    private boolean isJava14() {
        if (System.getProperty("java.version") == null) return false;
        return System.getProperty("java.version").startsWith("1.4");
    }

    /**
     * Workaround for Bug ID: 4695909 in java 1.4, fixed in 1.5
     * JEditorPane fails to display HTML BODY when META tag included in
     * HEAD section.
     *
     * Code modified by Phil Powell
     *
     * <html>
     * <head>
     * <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
     * </head>
     * <body>
     * @param text html to strip.
     * @return same HTML text w/o the META tag.
     */
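    private String stripMetaTag(String text) {
        // String used for searching, comparison and indexing
        String textUpperCase = text.toUpperCase();

        // No trailing space in the search keys, so that a bare <head> or
        // <body> tag without attributes is found as well
        int indexHead = textUpperCase.indexOf("<HEAD");
        int indexMeta = textUpperCase.indexOf("<META ");
        int indexBody = textUpperCase.indexOf("<BODY");

        // Nothing to strip when there is no META tag, or when it lies
        // outside the HEAD section (before <head> or after <body>)
        if (indexMeta == -1 || indexMeta < indexHead
                || (indexBody != -1 && indexMeta > indexBody)) {
            return text;
        }

        // Find the end of the META tag
        int indexMetaEnd = textUpperCase.indexOf(">", indexMeta);
        if (indexMetaEnd == -1) {
            return text; // unterminated tag; leave the text alone
        }

        // Strip the META tag, keeping everything before and after it
        return text.substring(0, indexMeta) + text.substring(indexMetaEnd + 1);
    }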
    // </editor-fold>
}

[/code]
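
The index arithmetic in cleanHTML(String html) is easy to get wrong whenever indexOf() returns -1. As an alternative, here is a rough sketch that removes the same kinds of tags with java.util.regex. HtmlStripper and its pattern names are hypothetical, and regular expressions are not a real HTML parser, so treat this as a sketch for reasonably well-formed pages only.

```java
import java.util.regex.Pattern;

/** Hypothetical regex-based alternative to the index-based cleanHTML. */
final class HtmlStripper {

    // <!-- ... --> comments, which may contain '>' characters
    private static final Pattern COMMENT =
            Pattern.compile("<!--.*?-->", Pattern.DOTALL);

    // Paired tags, removed together with everything between them
    private static final Pattern PAIRED = Pattern.compile(
            "<(script|style|object)\\b[^>]*>.*?</\\1\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Stand-alone tags such as <link>, plus any leftover unpaired tags
    private static final Pattern SINGLE = Pattern.compile(
            "</?(?:link|embed|script|style|object)\\b[^>]*>",
            Pattern.CASE_INSENSITIVE);

    private HtmlStripper() {
    }

    static String strip(String html) {
        String out = COMMENT.matcher(html).replaceAll("");
        out = PAIRED.matcher(out).replaceAll("");
        out = SINGLE.matcher(out).replaceAll("");
        return out.trim();
    }
}
```

Comments are removed first so that a '>' inside a comment cannot be mistaken for the end of a tag.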

If you instead call

browser.getText()

you will get a NullPointerException.

If you try

[code]
    public void setText(String text) {
        // Workaround for Bug ID: 4695909 in Java 1.4:
        // JEditorPane does not handle the META tag in the HTML HEAD
        if (isJava14() && "text/html".equalsIgnoreCase(getContentType())) {
            text = stripMetaTag(text);
        }
        System.out.println(text); // YOU WILL SEE CNN'S HTML
        super.setText(text);
        System.out.println(super.getText()); // SEE BELOW
    }
[/code]

You see only this:

<html>
  <head>

  </head>
  <body>
    <p style="margin-top: 0">

    </p>
  </body>
</html>
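
One possible explanation for the empty skeleton, offered as an assumption rather than a confirmed diagnosis: when the HTML contains a META tag that declares a charset, the Swing HTML parser throws a javax.swing.text.ChangedCharSetException while reading from a Reader, and the document is left empty. Setting the "IgnoreCharsetDirective" document property to Boolean.TRUE tells HTMLEditorKit to skip that directive. The sketch below shows the effect on a bare HTMLEditorKit document; CharsetDirectiveDemo and parseToPlainText are hypothetical names, and the sample page is made up.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.ChangedCharSetException;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;

/** Hypothetical demo of the "IgnoreCharsetDirective" document property. */
final class CharsetDirectiveDemo {

    private CharsetDirectiveDemo() {
    }

    /**
     * Parses the HTML into a fresh document and returns its plain text,
     * or null if the parser gave up because of a charset directive.
     */
    static String parseToPlainText(String html, boolean ignoreCharsetDirective) {
        HTMLEditorKit kit = new HTMLEditorKit();
        Document doc = kit.createDefaultDocument();
        // HTMLEditorKit.read() looks this property up before parsing
        doc.putProperty("IgnoreCharsetDirective",
                Boolean.valueOf(ignoreCharsetDirective));
        try {
            kit.read(new StringReader(html), doc, 0);
            return doc.getText(0, doc.getLength());
        } catch (ChangedCharSetException e) {
            return null; // the parser asked for a re-read in another charset
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With a JEditorPane, the equivalent step would be calling getDocument().putProperty("IgnoreCharsetDirective", Boolean.TRUE) after setContentType("text/html") and before setText().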