Re: Splitting a String with a Regex
Would you please try this one?
public class MultiXMLSplit {
    private static final String xmlStr =
        "<?xml><root>hello1</root><?xml><root>hello2</root><?xml><root>hello3</root>";

    public static void main(String[] args) {
        // index1 marks the start of the current document, index2 the start of the next one
        int index1 = xmlStr.indexOf("<?xml");
        int index2;
        while (index1 != -1 && index1 < xmlStr.length() - 1) {
            index2 = xmlStr.indexOf("<?xml", index1 + 1);
            if (index2 != -1 && index2 < xmlStr.length()) {
                System.out.println(xmlStr.substring(index1, index2));
            } else {
                break; // no further "<?xml": index1 still points at the last document
            }
            index1 = index2;
        }
        // Deal with the last xml doc
        if (index1 != -1 && index1 < xmlStr.length() - 1) {
            System.out.println(xmlStr.substring(index1));
        }
    }
}
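Since the thread asks about a regex: the same split can be sketched with a zero-width lookahead, which cuts the string just before each "<?xml" without consuming it. This is only a sketch. The class name RegexXmlSplit and the helper split() are names I made up, and it assumes "<?xml" never occurs inside a document (for example in a CDATA section, or as another processing instruction such as <?xml-stylesheet ...?>).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class RegexXmlSplit {
    // Zero-width lookahead: matches the empty position just before each "<?xml",
    // so split() cuts there and every piece keeps its own "<?xml" prefix.
    private static final Pattern DOC_START = Pattern.compile("(?=<\\?xml)");

    // Hypothetical helper: returns one string per "<?xml"-prefixed piece.
    public static List<String> split(String xmlStr) {
        List<String> docs = new ArrayList<String>();
        for (String part : DOC_START.split(xmlStr)) {
            // Skip an empty or whitespace-only piece before the first document.
            if (!part.trim().isEmpty()) {
                docs.add(part);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        String xmlStr =
            "<?xml><root>hello1</root><?xml><root>hello2</root><?xml><root>hello3</root>";
        for (String doc : split(xmlStr)) {
            System.out.println(doc);
        }
    }
}
```

Any characters that sit between two documents end up at the tail of the preceding piece, so each piece may still need cleaning up before it is parsed.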
You may also need to add code to trim the whitespace at the head of each
XML document. As far as I know, if the text starts with whitespace before
the <?xml ...?> declaration, the XML parser will not parse it. You will
get an error message like this:
The processing instruction target matching "[xX][mM][lL]" is not
allowed.
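The trimming step might look roughly like the sketch below. It uses the JDK's built-in DOM parser instead of dom4j, purely so the example is self-contained. The class name TrimAndParse and the helper parse() are my own names, and the sample text uses a complete declaration (<?xml version="1.0"?>), since the shortened "<?xml>" in the test string above is not one a parser would accept.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class TrimAndParse {
    // Hypothetical helper: trims one piece of text and parses it as a single XML document.
    public static Document parse(String xmlText) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // trim() strips leading and trailing whitespace, so the "<?xml"
            // declaration is the very first thing the parser sees.
            return builder.parse(new InputSource(new StringReader(xmlText.trim())));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not a well-formed XML document", e);
        }
    }

    public static void main(String[] args) {
        // One split-off piece with line breaks and spaces around it.
        String piece = "\r\n  <?xml version=\"1.0\"?><root>hello1</root>\r\n";
        Document doc = parse(piece);
        System.out.println(doc.getDocumentElement().getTextContent()); // prints "hello1"
    }
}
```

Note that trim() only removes whitespace. If other stray characters sit between the documents, they would have to be stripped some other way.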
<stevengarcia@yahoo.com> wrote in message
news:1146167109.573578.100160@g10g2000cwb.googlegroups.com...
I have multiple root XML documents in a String that looks like
"<?xml...><response .../><?xml...><response .../><?xml...><response
.../>"
There are three valid XML documents above. Unfortunately I have all of
them in one String, so (as far as I can tell) XML parsing with dom4j
will not give me three Document objects.
I am trying to write a method that will split the above String into
three separate strings that are all valid XML, and can be parsed by an
XML parser. First I tried String.split(), but there is no good
delimiter. Then I tried writing a regular expression, and I think
regexes will work here, but I'm not proficient at this advanced topic.
Also, the real XML has carriage returns, line feeds and other random
characters between each XML document. The XML within each document is
assured to be valid, however.
Is a regex a good way to do this? Your help would be appreciated.